This notebook is a template listing each step you need to complete for the project.
Please fill in your code where there are explicit ? markers in the notebook. You are welcome to add more cells and code as you see fit.
Once you have completed all the code implementations, please export your notebook as an HTML file so the reviewers can view your code. Make sure all cell outputs are rendered correctly.
File -> Export Notebook As... -> Export Notebook As HTML
There is also a writeup to complete after all code implementation is done. Please answer all questions and attach the necessary tables and charts. You can complete the writeup in either Markdown or PDF.
Completing the code template and writeup template will cover all of the rubric points for this project.
The rubric contains optional "Stand Out Suggestions" for enhancing the project beyond the minimum requirements. If you decide to pursue them, you can include the code in this notebook and discuss the results in the writeup file.
Below is an example of the steps to get the API username and key. Each student will have their own username and key.
Download kaggle.json and use the username and key it contains.
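One way to wire the credentials into the notebook is to load `kaggle.json` and export the two environment variables the Kaggle client recognizes. A minimal sketch (the `kaggle.json` path and the placeholder values are assumptions; point it at wherever you uploaded your own file):

```python
import json
import os
from pathlib import Path

# Assumed upload location; adjust to wherever your kaggle.json lives
cred_path = Path("kaggle.json")

if cred_path.exists():
    creds = json.loads(cred_path.read_text())
else:
    # Placeholder values so the cell still runs end-to-end; replace with your own
    creds = {"username": "your_kaggle_username", "key": "your_api_key"}

# The Kaggle API client also picks credentials up from these variables
os.environ["KAGGLE_USERNAME"] = creds["username"]
os.environ["KAGGLE_KEY"] = creds["key"]
```

Keep the key out of any notebook you submit; loading it from a file or environment variables avoids committing it by accident.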
ml.t3.medium instance (2 vCPU + 4 GiB)
Python 3 (MXNet 1.8 Python 3.7 CPU Optimized)

!pip install -U pip
!pip install -U setuptools wheel
!pip install -U "mxnet<2.0.0" bokeh==2.0.1
!pip install autogluon --no-cache-dir
# Without --no-cache-dir, smaller AWS instances may have trouble installing
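The pins above (`mxnet<2.0.0`, `bokeh==2.0.1`) keep the resolver on versions known to work with this AutoGluon release. As a rough illustration of how an exclusive `<` specifier constrains the candidates pip considers (a simplified sketch: real pip follows full PEP 440 rules, including pre-release and local-version segments):

```python
def version_tuple(v: str) -> tuple:
    # Simplified: split "1.9.1" into (1, 9, 1); real PEP 440 parsing is richer
    return tuple(int(part) for part in v.split("."))

def satisfies_upper_bound(candidate: str, bound: str) -> bool:
    # Models an exclusive upper bound such as "mxnet<2.0.0"
    return version_tuple(candidate) < version_tuple(bound)

# mxnet 1.9.1 (the version resolved in the log below) satisfies "<2.0.0",
# while 2.0.0 itself would be excluded
print(satisfies_upper_bound("1.9.1", "2.0.0"))   # True
print(satisfies_upper_bound("2.0.0", "2.0.0"))   # False
```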
Requirement already satisfied: pip in /usr/local/lib/python3.7/site-packages (21.3.1)
Collecting pip
Using cached pip-22.3-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 21.3.1
Uninstalling pip-21.3.1:
Successfully uninstalled pip-21.3.1
Successfully installed pip-22.3
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Requirement already satisfied: setuptools in /usr/local/lib/python3.7/site-packages (59.4.0)
Collecting setuptools
Using cached setuptools-65.5.0-py3-none-any.whl (1.2 MB)
Collecting wheel
Using cached wheel-0.37.1-py2.py3-none-any.whl (35 kB)
Installing collected packages: wheel, setuptools
Attempting uninstall: setuptools
Found existing installation: setuptools 59.4.0
Uninstalling setuptools-59.4.0:
Successfully uninstalled setuptools-59.4.0
Successfully installed setuptools-65.5.0 wheel-0.37.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Collecting mxnet<2.0.0
Using cached mxnet-1.9.1-py3-none-manylinux2014_x86_64.whl (49.1 MB)
Collecting bokeh==2.0.1
Using cached bokeh-2.0.1-py3-none-any.whl
Requirement already satisfied: pillow>=4.0 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (8.4.0)
Requirement already satisfied: typing-extensions>=3.7.4 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (4.0.1)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (2.8.2)
Requirement already satisfied: tornado>=5 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (6.1)
Requirement already satisfied: packaging>=16.8 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (21.3)
Requirement already satisfied: Jinja2>=2.7 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (3.0.3)
Requirement already satisfied: PyYAML>=3.10 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (5.4.1)
Requirement already satisfied: numpy>=1.11.3 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (1.19.1)
Requirement already satisfied: requests<3,>=2.20.0 in /usr/local/lib/python3.7/site-packages (from mxnet<2.0.0) (2.22.0)
Requirement already satisfied: graphviz<0.9.0,>=0.8.1 in /usr/local/lib/python3.7/site-packages (from mxnet<2.0.0) (0.8.4)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.7/site-packages (from Jinja2>=2.7->bokeh==2.0.1) (2.0.1)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/site-packages (from packaging>=16.8->bokeh==2.0.1) (3.0.6)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/site-packages (from python-dateutil>=2.1->bokeh==2.0.1) (1.16.0)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (1.25.11)
Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.7/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.7/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (2021.10.8)
Installing collected packages: mxnet, bokeh
Attempting uninstall: bokeh
Found existing installation: bokeh 2.4.2
Uninstalling bokeh-2.4.2:
Successfully uninstalled bokeh-2.4.2
Successfully installed bokeh-2.0.1 mxnet-1.9.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Collecting autogluon
Downloading autogluon-0.5.2-py3-none-any.whl (9.6 kB)
Collecting autogluon.tabular[all]==0.5.2
Downloading autogluon.tabular-0.5.2-py3-none-any.whl (274 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 274.2/274.2 kB 104.9 MB/s eta 0:00:00
Collecting autogluon.core[all]==0.5.2
Downloading autogluon.core-0.5.2-py3-none-any.whl (210 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 210.4/210.4 kB 184.7 MB/s eta 0:00:00
Collecting autogluon.vision==0.5.2
Downloading autogluon.vision-0.5.2-py3-none-any.whl (48 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.8/48.8 kB 156.2 MB/s eta 0:00:00
Collecting autogluon.multimodal==0.5.2
Downloading autogluon.multimodal-0.5.2-py3-none-any.whl (149 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 149.4/149.4 kB 193.6 MB/s eta 0:00:00
Collecting autogluon.features==0.5.2
Downloading autogluon.features-0.5.2-py3-none-any.whl (59 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.4/59.4 kB 137.1 MB/s eta 0:00:00
Collecting autogluon.text==0.5.2
Downloading autogluon.text-0.5.2-py3-none-any.whl (61 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.9/61.9 kB 156.6 MB/s eta 0:00:00
Collecting autogluon.timeseries[all]==0.5.2
Downloading autogluon.timeseries-0.5.2-py3-none-any.whl (65 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.4/65.4 kB 113.3 MB/s eta 0:00:00
Collecting distributed<=2021.11.2,>=2021.09.1
Downloading distributed-2021.11.2-py3-none-any.whl (802 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 802.2/802.2 kB 180.0 MB/s eta 0:00:00
Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.5.2->autogluon) (3.5.0)
Requirement already satisfied: boto3 in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.5.2->autogluon) (1.20.17)
Requirement already satisfied: tqdm>=4.38.0 in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.5.2->autogluon) (4.39.0)
Collecting scipy<1.8.0,>=1.5.4
Downloading scipy-1.7.3-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (38.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 38.1/38.1 MB 158.2 MB/s eta 0:00:00
Requirement already satisfied: pandas!=1.4.0,<1.5,>=1.2.5 in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.5.2->autogluon) (1.3.4)
Collecting dask<=2021.11.2,>=2021.09.1
Downloading dask-2021.11.2-py3-none-any.whl (1.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 215.7 MB/s eta 0:00:00
Collecting autogluon.common==0.5.2
Downloading autogluon.common-0.5.2-py3-none-any.whl (37 kB)
Requirement already satisfied: scikit-learn<1.1,>=1.0.0 in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.5.2->autogluon) (1.0.1)
Requirement already satisfied: requests in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.5.2->autogluon) (2.22.0)
Collecting numpy<1.23,>=1.21
Downloading numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.7/15.7 MB 155.3 MB/s eta 0:00:00
Collecting hyperopt<0.2.8,>=0.2.7
Downloading hyperopt-0.2.7-py2.py3-none-any.whl (1.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 210.7 MB/s eta 0:00:00
Collecting ray[tune]<1.14,>=1.13
Downloading ray-1.13.0-cp37-cp37m-manylinux2014_x86_64.whl (54.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.5/54.5 MB 149.8 MB/s eta 0:00:00
Requirement already satisfied: psutil<6,>=5.7.3 in /usr/local/lib/python3.7/site-packages (from autogluon.features==0.5.2->autogluon) (5.8.0)
Collecting torch<1.13,>=1.9
Downloading torch-1.12.1-cp37-cp37m-manylinux1_x86_64.whl (776.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 776.3/776.3 MB 163.8 MB/s eta 0:00:00
Collecting torchvision<0.14.0
Downloading torchvision-0.13.1-cp37-cp37m-manylinux1_x86_64.whl (19.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.1/19.1 MB 154.7 MB/s eta 0:00:00
Collecting omegaconf<2.2.0,>=2.1.1
Downloading omegaconf-2.1.2-py3-none-any.whl (74 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.7/74.7 kB 162.7 MB/s eta 0:00:00
Collecting Pillow<9.1.0,>=9.0.1
Downloading Pillow-9.0.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.3/4.3 MB 154.8 MB/s eta 0:00:00
Collecting smart-open<5.3.0,>=5.2.1
Downloading smart_open-5.2.1-py3-none-any.whl (58 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.6/58.6 kB 155.5 MB/s eta 0:00:00
Collecting nlpaug<=1.1.10,>=1.1.10
Downloading nlpaug-1.1.10-py3-none-any.whl (410 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.8/410.8 kB 205.2 MB/s eta 0:00:00
Collecting torchmetrics<0.8.0,>=0.7.2
Downloading torchmetrics-0.7.3-py3-none-any.whl (398 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 398.2/398.2 kB 204.1 MB/s eta 0:00:00
Collecting nptyping<1.5.0,>=1.4.4
Downloading nptyping-1.4.4-py3-none-any.whl (31 kB)
Collecting scikit-image<0.20.0,>=0.19.1
Downloading scikit_image-0.19.3-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (13.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.5/13.5 MB 142.7 MB/s eta 0:00:00
Collecting timm<0.6.0
Downloading timm-0.5.4-py3-none-any.whl (431 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 431.5/431.5 kB 196.8 MB/s eta 0:00:00
Collecting fairscale<=0.4.6,>=0.4.5
Downloading fairscale-0.4.6.tar.gz (248 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 248.2/248.2 kB 196.4 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Collecting nltk<4.0.0,>=3.4.5
Downloading nltk-3.7-py3-none-any.whl (1.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 211.7 MB/s eta 0:00:00
Collecting pytorch-lightning<1.7.0,>=1.6.0
Downloading pytorch_lightning-1.6.5-py3-none-any.whl (585 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 585.9/585.9 kB 199.9 MB/s eta 0:00:00
Collecting pytorch-metric-learning<1.4.0,>=1.3.0
Downloading pytorch_metric_learning-1.3.2-py3-none-any.whl (109 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 109.4/109.4 kB 157.0 MB/s eta 0:00:00
Collecting torchtext<0.14.0
Downloading torchtext-0.13.1-cp37-cp37m-manylinux1_x86_64.whl (1.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 214.0 MB/s eta 0:00:00
Collecting protobuf<=3.18.1
Downloading protobuf-3.18.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 205.0 MB/s eta 0:00:00
Collecting sentencepiece<0.2.0,>=0.1.95
Downloading sentencepiece-0.1.97-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 191.2 MB/s eta 0:00:00
Collecting transformers<4.21.0,>=4.18.0
Downloading transformers-4.20.1-py3-none-any.whl (4.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.4/4.4 MB 183.8 MB/s eta 0:00:00
Requirement already satisfied: networkx<3.0,>=2.3 in /usr/local/lib/python3.7/site-packages (from autogluon.tabular[all]==0.5.2->autogluon) (2.6.3)
Collecting xgboost<1.5,>=1.4
Downloading xgboost-1.4.2-py3-none-manylinux2010_x86_64.whl (166.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 166.7/166.7 MB 170.5 MB/s eta 0:00:00
Collecting lightgbm<3.4,>=3.3
Downloading lightgbm-3.3.3-py3-none-manylinux1_x86_64.whl (2.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 181.3 MB/s eta 0:00:00
Collecting catboost<1.1,>=1.0
Downloading catboost-1.0.6-cp37-none-manylinux1_x86_64.whl (76.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 76.6/76.6 MB 159.9 MB/s eta 0:00:00
Collecting fastai<2.8,>=2.3.1
Downloading fastai-2.7.9-py3-none-any.whl (225 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 225.5/225.5 kB 195.5 MB/s eta 0:00:00
Collecting autogluon-contrib-nlp==0.0.1b20220208
Downloading autogluon_contrib_nlp-0.0.1b20220208-py3-none-any.whl (157 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 157.3/157.3 kB 193.9 MB/s eta 0:00:00
Collecting gluonts<0.10.0,>=0.8.0
Downloading gluonts-0.9.9-py3-none-any.whl (2.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.8/2.8 MB 189.1 MB/s eta 0:00:00
Collecting tbats~=1.1
Downloading tbats-1.1.1-py3-none-any.whl (43 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 43.8/43.8 kB 137.8 MB/s eta 0:00:00
Collecting pmdarima~=1.8.2
Downloading pmdarima-1.8.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 210.5 MB/s eta 0:00:00
Collecting sktime~=0.11.4
Downloading sktime-0.11.4-py3-none-any.whl (6.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.7/6.7 MB 184.5 MB/s eta 0:00:00
Collecting gluoncv<0.10.6,>=0.10.5
Downloading gluoncv-0.10.5.post0-py2.py3-none-any.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 209.9 MB/s eta 0:00:00
Collecting flake8
Downloading flake8-5.0.4-py2.py3-none-any.whl (61 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.9/61.9 kB 153.3 MB/s eta 0:00:00
Requirement already satisfied: pyarrow in /usr/local/lib/python3.7/site-packages (from autogluon-contrib-nlp==0.0.1b20220208->autogluon.text==0.5.2->autogluon) (6.0.1)
Collecting sacrebleu
Downloading sacrebleu-2.3.1-py3-none-any.whl (118 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 118.9/118.9 kB 185.6 MB/s eta 0:00:00
Collecting sacremoses>=0.0.38
Downloading sacremoses-0.0.53.tar.gz (880 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 880.6/880.6 kB 209.0 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Collecting regex
Downloading regex-2022.9.13-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (757 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 757.0/757.0 kB 213.1 MB/s eta 0:00:00
Collecting contextvars
Downloading contextvars-2.4.tar.gz (9.6 kB)
Preparing metadata (setup.py) ... done
Collecting sentencepiece<0.2.0,>=0.1.95
Downloading sentencepiece-0.1.95-cp37-cp37m-manylinux2014_x86_64.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 212.8 MB/s eta 0:00:00
Collecting yacs>=0.1.6
Downloading yacs-0.1.8-py3-none-any.whl (14 kB)
Collecting tokenizers>=0.9.4
Downloading tokenizers-0.13.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.6/7.6 MB 154.6 MB/s eta 0:00:00
Requirement already satisfied: six in /usr/local/lib/python3.7/site-packages (from catboost<1.1,>=1.0->autogluon.tabular[all]==0.5.2->autogluon) (1.16.0)
Requirement already satisfied: graphviz in /usr/local/lib/python3.7/site-packages (from catboost<1.1,>=1.0->autogluon.tabular[all]==0.5.2->autogluon) (0.8.4)
Requirement already satisfied: plotly in /usr/local/lib/python3.7/site-packages (from catboost<1.1,>=1.0->autogluon.tabular[all]==0.5.2->autogluon) (5.4.0)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.7/site-packages (from dask<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.5.2->autogluon) (5.4.1)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.7/site-packages (from dask<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.5.2->autogluon) (21.3)
Collecting toolz>=0.8.2
Downloading toolz-0.12.0-py3-none-any.whl (55 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 55.8/55.8 kB 137.4 MB/s eta 0:00:00
Requirement already satisfied: cloudpickle>=1.1.1 in /usr/local/lib/python3.7/site-packages (from dask<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.5.2->autogluon) (2.0.0)
Collecting partd>=0.3.10
Downloading partd-1.3.0-py3-none-any.whl (18 kB)
Requirement already satisfied: fsspec>=0.6.0 in /usr/local/lib/python3.7/site-packages (from dask<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.5.2->autogluon) (2021.11.1)
Collecting zict>=0.1.3
Downloading zict-2.2.0-py2.py3-none-any.whl (23 kB)
Requirement already satisfied: tornado>=5 in /usr/local/lib/python3.7/site-packages (from distributed<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.5.2->autogluon) (6.1)
Collecting click>=6.6
Downloading click-8.1.3-py3-none-any.whl (96 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.6/96.6 kB 176.0 MB/s eta 0:00:00
Requirement already satisfied: setuptools in /usr/local/lib/python3.7/site-packages (from distributed<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.5.2->autogluon) (65.5.0)
Collecting sortedcontainers!=2.0.0,!=2.0.1
Downloading sortedcontainers-2.4.0-py2.py3-none-any.whl (29 kB)
Collecting tblib>=1.6.0
Downloading tblib-1.7.0-py2.py3-none-any.whl (12 kB)
Collecting msgpack>=0.6.0
Downloading msgpack-1.0.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (299 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 299.8/299.8 kB 166.9 MB/s eta 0:00:00
Requirement already satisfied: jinja2 in /usr/local/lib/python3.7/site-packages (from distributed<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.5.2->autogluon) (3.0.3)
Collecting fastdownload<2,>=0.0.5
Downloading fastdownload-0.0.7-py3-none-any.whl (12 kB)
Requirement already satisfied: pip in /usr/local/lib/python3.7/site-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.5.2->autogluon) (22.3)
Collecting spacy<4
Downloading spacy-3.4.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.4/6.4 MB 168.5 MB/s eta 0:00:00
Collecting fastprogress>=0.2.4
Downloading fastprogress-1.0.3-py3-none-any.whl (12 kB)
Collecting fastcore<1.6,>=1.4.5
Downloading fastcore-1.5.27-py3-none-any.whl (67 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.1/67.1 kB 165.8 MB/s eta 0:00:00
Requirement already satisfied: portalocker in /usr/local/lib/python3.7/site-packages (from gluoncv<0.10.6,>=0.10.5->autogluon.vision==0.5.2->autogluon) (2.3.2)
Collecting autocfg
Downloading autocfg-0.0.8-py3-none-any.whl (13 kB)
Requirement already satisfied: opencv-python in /usr/local/lib/python3.7/site-packages (from gluoncv<0.10.6,>=0.10.5->autogluon.vision==0.5.2->autogluon) (4.5.4.60)
Collecting pydantic~=1.1
Downloading pydantic-1.10.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.8/11.8 MB 154.0 MB/s eta 0:00:00
Collecting holidays>=0.9
Downloading holidays-0.16-py3-none-any.whl (184 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 184.6/184.6 kB 197.7 MB/s eta 0:00:00
Requirement already satisfied: typing-extensions~=4.0 in /usr/local/lib/python3.7/site-packages (from gluonts<0.10.0,>=0.8.0->autogluon.timeseries[all]==0.5.2->autogluon) (4.0.1)
Collecting py4j
Downloading py4j-0.10.9.7-py2.py3-none-any.whl (200 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 200.5/200.5 kB 198.8 MB/s eta 0:00:00
Collecting future
Downloading future-0.18.2.tar.gz (829 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 829.2/829.2 kB 213.2 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Requirement already satisfied: wheel in /usr/local/lib/python3.7/site-packages (from lightgbm<3.4,>=3.3->autogluon.tabular[all]==0.5.2->autogluon) (0.37.1)
Requirement already satisfied: setuptools-scm>=4 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.5.2->autogluon) (6.3.2)
Requirement already satisfied: pyparsing>=2.2.1 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.5.2->autogluon) (3.0.6)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.5.2->autogluon) (0.11.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.5.2->autogluon) (1.3.2)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.5.2->autogluon) (2.8.2)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.5.2->autogluon) (4.28.2)
Requirement already satisfied: joblib in /usr/local/lib/python3.7/site-packages (from nltk<4.0.0,>=3.4.5->autogluon.multimodal==0.5.2->autogluon) (1.1.0)
Collecting typish>=1.7.0
Downloading typish-1.9.3-py3-none-any.whl (45 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.1/45.1 kB 74.3 MB/s eta 0:00:00
Collecting antlr4-python3-runtime==4.8
Downloading antlr4-python3-runtime-4.8.tar.gz (112 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 112.4/112.4 kB 146.8 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/site-packages (from pandas!=1.4.0,<1.5,>=1.2.5->autogluon.core[all]==0.5.2->autogluon) (2021.3)
Requirement already satisfied: Cython!=0.29.18,>=0.29 in /usr/local/lib/python3.7/site-packages (from pmdarima~=1.8.2->autogluon.timeseries[all]==0.5.2->autogluon) (0.29.24)
Collecting statsmodels!=0.12.0,>=0.11
Downloading statsmodels-0.13.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.8/9.8 MB 171.1 MB/s eta 0:00:00
Requirement already satisfied: urllib3 in /usr/local/lib/python3.7/site-packages (from pmdarima~=1.8.2->autogluon.timeseries[all]==0.5.2->autogluon) (1.25.11)
Collecting tqdm>=4.38.0
Downloading tqdm-4.64.1-py2.py3-none-any.whl (78 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.5/78.5 kB 173.0 MB/s eta 0:00:00
Collecting pyDeprecate>=0.3.1
Downloading pyDeprecate-0.3.2-py3-none-any.whl (10 kB)
Collecting tensorboard>=2.2.0
Downloading tensorboard-2.10.1-py3-none-any.whl (5.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.9/5.9 MB 179.1 MB/s eta 0:00:00
Collecting virtualenv
Downloading virtualenv-20.16.5-py3-none-any.whl (8.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.8/8.8 MB 171.8 MB/s eta 0:00:00
Collecting aiosignal
Downloading aiosignal-1.2.0-py3-none-any.whl (8.2 kB)
Collecting filelock
Downloading filelock-3.8.0-py3-none-any.whl (10 kB)
Collecting jsonschema
Downloading jsonschema-4.16.0-py3-none-any.whl (83 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 83.1/83.1 kB 167.8 MB/s eta 0:00:00
Collecting frozenlist
Downloading frozenlist-1.3.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (148 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 148.0/148.0 kB 192.4 MB/s eta 0:00:00
Collecting click>=6.6
Downloading click-8.0.4-py3-none-any.whl (97 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.5/97.5 kB 174.3 MB/s eta 0:00:00
Collecting grpcio<=1.43.0,>=1.28.1
Downloading grpcio-1.43.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.1/4.1 MB 200.9 MB/s eta 0:00:00
Requirement already satisfied: attrs in /usr/local/lib/python3.7/site-packages (from ray[tune]<1.14,>=1.13->autogluon.core[all]==0.5.2->autogluon) (21.2.0)
Requirement already satisfied: tabulate in /usr/local/lib/python3.7/site-packages (from ray[tune]<1.14,>=1.13->autogluon.core[all]==0.5.2->autogluon) (0.8.9)
Collecting tensorboardX>=1.9
Downloading tensorboardX-2.5.1-py2.py3-none-any.whl (125 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 125.4/125.4 kB 137.4 MB/s eta 0:00:00
Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.7/site-packages (from requests->autogluon.core[all]==0.5.2->autogluon) (2.8)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/site-packages (from requests->autogluon.core[all]==0.5.2->autogluon) (2021.10.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.7/site-packages (from requests->autogluon.core[all]==0.5.2->autogluon) (3.0.4)
Collecting PyWavelets>=1.1.1
Downloading PyWavelets-1.3.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.4/6.4 MB 155.0 MB/s eta 0:00:00
Collecting tifffile>=2019.7.26
Downloading tifffile-2021.11.2-py3-none-any.whl (178 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 178.9/178.9 kB 181.1 MB/s eta 0:00:00
Requirement already satisfied: imageio>=2.4.1 in /usr/local/lib/python3.7/site-packages (from scikit-image<0.20.0,>=0.19.1->autogluon.multimodal==0.5.2->autogluon) (2.13.1)
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn<1.1,>=1.0.0->autogluon.core[all]==0.5.2->autogluon) (3.0.0)
Requirement already satisfied: numba>=0.53 in /usr/local/lib/python3.7/site-packages (from sktime~=0.11.4->autogluon.timeseries[all]==0.5.2->autogluon) (0.53.1)
Collecting deprecated>=1.2.13
Downloading Deprecated-1.2.13-py2.py3-none-any.whl (9.6 kB)
Collecting huggingface-hub<1.0,>=0.1.0
Downloading huggingface_hub-0.10.1-py3-none-any.whl (163 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.5/163.5 kB 186.8 MB/s eta 0:00:00
Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/site-packages (from transformers<4.21.0,>=4.18.0->autogluon.multimodal==0.5.2->autogluon) (4.8.2)
Collecting tokenizers>=0.9.4
Downloading tokenizers-0.12.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.6/6.6 MB 161.4 MB/s eta 0:00:00
Requirement already satisfied: botocore<1.24.0,>=1.23.17 in /usr/local/lib/python3.7/site-packages (from boto3->autogluon.core[all]==0.5.2->autogluon) (1.23.17)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /usr/local/lib/python3.7/site-packages (from boto3->autogluon.core[all]==0.5.2->autogluon) (0.10.0)
Requirement already satisfied: s3transfer<0.6.0,>=0.5.0 in /usr/local/lib/python3.7/site-packages (from boto3->autogluon.core[all]==0.5.2->autogluon) (0.5.0)
Collecting wrapt<2,>=1.10
Downloading wrapt-1.14.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (75 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 75.2/75.2 kB 165.5 MB/s eta 0:00:00
Collecting aiohttp
Downloading aiohttp-3.8.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (948 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 948.0/948.0 kB 207.6 MB/s eta 0:00:00
Collecting korean-lunar-calendar
Downloading korean_lunar_calendar-0.3.1-py3-none-any.whl (9.0 kB)
Collecting convertdate>=2.3.0
Downloading convertdate-2.4.0-py3-none-any.whl (47 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.9/47.9 kB 143.8 MB/s eta 0:00:00
Collecting hijri-converter
Downloading hijri_converter-2.2.4-py3-none-any.whl (14 kB)
Requirement already satisfied: llvmlite<0.37,>=0.36.0rc1 in /usr/local/lib/python3.7/site-packages (from numba>=0.53->sktime~=0.11.4->autogluon.timeseries[all]==0.5.2->autogluon) (0.36.0)
Collecting locket
Downloading locket-1.0.0-py2.py3-none-any.whl (4.4 kB)
Collecting typing-extensions~=4.0
Downloading typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Requirement already satisfied: tomli>=1.0.0 in /usr/local/lib/python3.7/site-packages (from setuptools-scm>=4->matplotlib->autogluon.core[all]==0.5.2->autogluon) (1.2.2)
Collecting spacy-loggers<2.0.0,>=1.0.0
Downloading spacy_loggers-1.0.3-py3-none-any.whl (9.3 kB)
Collecting typing-extensions~=4.0
Downloading typing_extensions-4.1.1-py3-none-any.whl (26 kB)
Collecting catalogue<2.1.0,>=2.0.6
Downloading catalogue-2.0.8-py3-none-any.whl (17 kB)
Collecting preshed<3.1.0,>=3.0.2
Downloading preshed-3.0.8-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (126 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 126.6/126.6 kB 186.6 MB/s eta 0:00:00
Collecting pathy>=0.3.5
Downloading pathy-0.6.2-py3-none-any.whl (42 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.8/42.8 kB 131.2 MB/s eta 0:00:00
Collecting spacy-legacy<3.1.0,>=3.0.10
Downloading spacy_legacy-3.0.10-py2.py3-none-any.whl (21 kB)
Collecting wasabi<1.1.0,>=0.9.1
Downloading wasabi-0.10.1-py3-none-any.whl (26 kB)
Collecting thinc<8.2.0,>=8.1.0
Downloading thinc-8.1.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (806 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 806.2/806.2 kB 210.8 MB/s eta 0:00:00
Collecting cymem<2.1.0,>=2.0.2
Downloading cymem-2.0.7-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (36 kB)
Collecting langcodes<4.0.0,>=3.2.0
Downloading langcodes-3.3.0-py3-none-any.whl (181 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 181.6/181.6 kB 197.0 MB/s eta 0:00:00
Collecting murmurhash<1.1.0,>=0.28.0
Downloading murmurhash-1.0.9-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (21 kB)
Collecting typer<0.5.0,>=0.3.0
Downloading typer-0.4.2-py3-none-any.whl (27 kB)
Collecting srsly<3.0.0,>=2.4.3
Downloading srsly-2.4.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (490 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 490.0/490.0 kB 209.0 MB/s eta 0:00:00
Collecting patsy>=0.5.2
Downloading patsy-0.5.3-py2.py3-none-any.whl (233 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 233.8/233.8 kB 155.3 MB/s eta 0:00:00
Collecting google-auth-oauthlib<0.5,>=0.4.1
Downloading google_auth_oauthlib-0.4.6-py2.py3-none-any.whl (18 kB)
Collecting absl-py>=0.4
Downloading absl_py-1.3.0-py3-none-any.whl (124 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.6/124.6 kB 182.1 MB/s eta 0:00:00
Collecting tensorboard-data-server<0.7.0,>=0.6.0
Downloading tensorboard_data_server-0.6.1-py3-none-manylinux2010_x86_64.whl (4.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.9/4.9 MB 202.7 MB/s eta 0:00:00
Collecting tensorboard-plugin-wit>=1.6.0
Downloading tensorboard_plugin_wit-1.8.1-py3-none-any.whl (781 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 781.3/781.3 kB 212.8 MB/s eta 0:00:00
Collecting google-auth<3,>=1.6.3
Downloading google_auth-2.13.0-py2.py3-none-any.whl (174 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 174.5/174.5 kB 183.1 MB/s eta 0:00:00
Requirement already satisfied: werkzeug>=1.0.1 in /usr/local/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch-lightning<1.7.0,>=1.6.0->autogluon.multimodal==0.5.2->autogluon) (2.0.2)
Collecting markdown>=2.6.8
Downloading Markdown-3.4.1-py3-none-any.whl (93 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 93.3/93.3 kB 162.7 MB/s eta 0:00:00
Collecting heapdict
Downloading HeapDict-1.0.1-py3-none-any.whl (3.9 kB)
Collecting immutables>=0.9
Downloading immutables-0.19-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (117 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 117.0/117.0 kB 177.7 MB/s eta 0:00:00
Collecting pyflakes<2.6.0,>=2.5.0
Downloading pyflakes-2.5.0-py2.py3-none-any.whl (66 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.1/66.1 kB 158.6 MB/s eta 0:00:00
Collecting pycodestyle<2.10.0,>=2.9.0
Downloading pycodestyle-2.9.1-py2.py3-none-any.whl (41 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 41.5/41.5 kB 120.1 MB/s eta 0:00:00
Collecting importlib-metadata
Downloading importlib_metadata-4.2.0-py3-none-any.whl (16 kB)
Collecting mccabe<0.8.0,>=0.7.0
Downloading mccabe-0.7.0-py2.py3-none-any.whl (7.3 kB)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/site-packages (from importlib-metadata->transformers<4.21.0,>=4.18.0->autogluon.multimodal==0.5.2->autogluon) (3.6.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.7/site-packages (from jinja2->distributed<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.5.2->autogluon) (2.0.1)
Collecting pkgutil-resolve-name>=1.3.10
Downloading pkgutil_resolve_name-1.3.10-py3-none-any.whl (4.7 kB)
Collecting importlib-resources>=1.4.0
Downloading importlib_resources-5.10.0-py3-none-any.whl (34 kB)
Collecting pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0
Downloading pyrsistent-0.18.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (117 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 117.1/117.1 kB 165.5 MB/s eta 0:00:00
Requirement already satisfied: tenacity>=6.2.0 in /usr/local/lib/python3.7/site-packages (from plotly->catboost<1.1,>=1.0->autogluon.tabular[all]==0.5.2->autogluon) (8.0.1)
Requirement already satisfied: colorama in /usr/local/lib/python3.7/site-packages (from sacrebleu->autogluon-contrib-nlp==0.0.1b20220208->autogluon.text==0.5.2->autogluon) (0.4.3)
Collecting lxml
Downloading lxml-4.9.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (6.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.4/6.4 MB 165.8 MB/s eta 0:00:00
Collecting virtualenv
Downloading virtualenv-20.16.4-py3-none-any.whl (8.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.8/8.8 MB 151.8 MB/s eta 0:00:00
Downloading virtualenv-20.16.3-py2.py3-none-any.whl (8.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.8/8.8 MB 158.9 MB/s eta 0:00:00
Downloading virtualenv-20.16.2-py2.py3-none-any.whl (8.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.8/8.8 MB 164.5 MB/s eta 0:00:00
Collecting platformdirs<3,>=2
Downloading platformdirs-2.5.2-py3-none-any.whl (14 kB)
Collecting distlib<1,>=0.3.1
Downloading distlib-0.3.6-py2.py3-none-any.whl (468 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 468.5/468.5 kB 205.3 MB/s eta 0:00:00
Collecting pymeeus<=1,>=0.3.13
Downloading PyMeeus-0.5.11.tar.gz (5.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.4/5.4 MB 184.6 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Collecting cachetools<6.0,>=2.0.0
Downloading cachetools-5.2.0-py3-none-any.whl (9.3 kB)
Collecting pyasn1-modules>=0.2.1
Downloading pyasn1_modules-0.2.8-py2.py3-none-any.whl (155 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 155.3/155.3 kB 184.3 MB/s eta 0:00:00
Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.7/site-packages (from google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch-lightning<1.7.0,>=1.6.0->autogluon.multimodal==0.5.2->autogluon) (4.7.2)
Collecting requests-oauthlib>=0.7.0
Downloading requests_oauthlib-1.3.1-py2.py3-none-any.whl (23 kB)
Collecting markdown>=2.6.8
Downloading Markdown-3.4-py3-none-any.whl (93 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 93.3/93.3 kB 157.6 MB/s eta 0:00:00
Downloading Markdown-3.3.7-py3-none-any.whl (97 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.8/97.8 kB 168.3 MB/s eta 0:00:00
Downloading Markdown-3.3.6-py3-none-any.whl (97 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.8/97.8 kB 168.8 MB/s eta 0:00:00
Downloading Markdown-3.3.4-py3-none-any.whl (97 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.6/97.6 kB 159.5 MB/s eta 0:00:00
Collecting confection<1.0.0,>=0.0.1
Downloading confection-0.0.3-py3-none-any.whl (32 kB)
Collecting blis<0.8.0,>=0.7.8
Downloading blis-0.7.9-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (10.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/10.2 MB 162.1 MB/s eta 0:00:00
Collecting yarl<2.0,>=1.0
Downloading yarl-1.8.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (231 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 231.3/231.3 kB 189.2 MB/s eta 0:00:00
Collecting asynctest==0.13.0
Downloading asynctest-0.13.0-py3-none-any.whl (26 kB)
Collecting multidict<7.0,>=4.5
Downloading multidict-6.0.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (94 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 94.8/94.8 kB 156.5 MB/s eta 0:00:00
Collecting charset-normalizer<3.0,>=2.0
Downloading charset_normalizer-2.1.1-py3-none-any.whl (39 kB)
Collecting async-timeout<5.0,>=4.0.0a3
Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.7/site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch-lightning<1.7.0,>=1.6.0->autogluon.multimodal==0.5.2->autogluon) (0.4.8)
Collecting oauthlib>=3.0.0
Downloading oauthlib-3.2.2-py3-none-any.whl (151 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 151.7/151.7 kB 184.9 MB/s eta 0:00:00
Building wheels for collected packages: fairscale, antlr4-python3-runtime, sacremoses, contextvars, future, pymeeus
Building wheel for fairscale (pyproject.toml) ... done
Created wheel for fairscale: filename=fairscale-0.4.6-py3-none-any.whl size=307225 sha256=46a37bbe98045b798d5241f86d0d509641a8c9f0ab4d9da1d562e550776a86ca
Stored in directory: /tmp/pip-ephem-wheel-cache-jwvr3ikj/wheels/0b/8c/fa/a9e102632bcb86e919561cf25ca1e0dd2ec67476f3a5544653
Building wheel for antlr4-python3-runtime (setup.py) ... done
Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.8-py3-none-any.whl size=141210 sha256=4df6d785de039497734234d1d0aeecbd195bc00b06345cf6cea90dc7f52c70c7
Stored in directory: /tmp/pip-ephem-wheel-cache-jwvr3ikj/wheels/c9/ef/75/1b8c6588a8a8a15d5a9136608a9d65172a226577e7ae89da31
Building wheel for sacremoses (setup.py) ... done
Created wheel for sacremoses: filename=sacremoses-0.0.53-py3-none-any.whl size=895241 sha256=96cf58e0a8f1c80589e55abd69e9d2f1306887e101a1e89e6c719bd1ae1dab4e
Stored in directory: /tmp/pip-ephem-wheel-cache-jwvr3ikj/wheels/5b/e0/77/05245143a5b31f65af6a21f7afd3219e9fa4896f918af45677
Building wheel for contextvars (setup.py) ... done
Created wheel for contextvars: filename=contextvars-2.4-py3-none-any.whl size=7664 sha256=eb553d92b3ea8cef289dc634b1d9d673beef50154818dd621fdca42f40409f86
Stored in directory: /tmp/pip-ephem-wheel-cache-jwvr3ikj/wheels/1b/4f/f6/2cf0b56beceeb4a516c29f1a061522603b2db256b1c9930fee
Building wheel for future (setup.py) ... done
Created wheel for future: filename=future-0.18.2-py3-none-any.whl size=491058 sha256=08100d14b25e6b5583691164f4315b4f706dd1cf88827a6b0a913821ee23f32f
Stored in directory: /tmp/pip-ephem-wheel-cache-jwvr3ikj/wheels/3e/3c/b4/7132d27620dd551cf00823f798a7190e7320ae7ffb71d1e989
Building wheel for pymeeus (setup.py) ... done
Created wheel for pymeeus: filename=PyMeeus-0.5.11-py3-none-any.whl size=730971 sha256=38b9660916c4fbe80e43f62f75c113ec6af0a63c9fd029398cb93a5b97ffaada
Stored in directory: /tmp/pip-ephem-wheel-cache-jwvr3ikj/wheels/bc/17/d4/0095e29d942940d5653b55f8503c4940e1fad226352c98c0d8
Successfully built fairscale antlr4-python3-runtime sacremoses contextvars future pymeeus
Installing collected packages: wasabi, typish, tokenizers, tensorboard-plugin-wit, sortedcontainers, sentencepiece, pymeeus, py4j, msgpack, korean-lunar-calendar, heapdict, distlib, cymem, antlr4-python3-runtime, zict, yacs, wrapt, typing-extensions, tqdm, toolz, tensorboard-data-server, tblib, spacy-loggers, spacy-legacy, smart-open, regex, pyrsistent, pyflakes, pyDeprecate, pycodestyle, pyasn1-modules, protobuf, platformdirs, pkgutil-resolve-name, Pillow, omegaconf, oauthlib, numpy, murmurhash, multidict, mccabe, lxml, locket, langcodes, importlib-resources, hijri-converter, grpcio, future, frozenlist, filelock, fastprogress, convertdate, charset-normalizer, cachetools, autocfg, asynctest, absl-py, yarl, torch, tifffile, tensorboardX, scipy, sacrebleu, requests-oauthlib, PyWavelets, pydantic, preshed, patsy, partd, nptyping, importlib-metadata, immutables, holidays, google-auth, fastcore, deprecated, catalogue, blis, async-timeout, aiosignal, xgboost, virtualenv, torchvision, torchtext, torchmetrics, statsmodels, srsly, scikit-image, nlpaug, markdown, jsonschema, hyperopt, huggingface-hub, google-auth-oauthlib, flake8, fastdownload, fairscale, dask, contextvars, click, aiohttp, typer, transformers, timm, tensorboard, sktime, sacremoses, ray, pytorch-metric-learning, pmdarima, nltk, lightgbm, gluonts, gluoncv, distributed, confection, catboost, thinc, tbats, pytorch-lightning, pathy, autogluon-contrib-nlp, autogluon.common, spacy, autogluon.features, autogluon.core, fastai, autogluon.vision, autogluon.timeseries, autogluon.tabular, autogluon.multimodal, autogluon.text, autogluon
Attempting uninstall: typing-extensions
Found existing installation: typing_extensions 4.0.1
Uninstalling typing_extensions-4.0.1:
Successfully uninstalled typing_extensions-4.0.1
Attempting uninstall: tqdm
Found existing installation: tqdm 4.39.0
Uninstalling tqdm-4.39.0:
Successfully uninstalled tqdm-4.39.0
Attempting uninstall: protobuf
Found existing installation: protobuf 3.19.1
Uninstalling protobuf-3.19.1:
Successfully uninstalled protobuf-3.19.1
Attempting uninstall: Pillow
Found existing installation: Pillow 8.4.0
Uninstalling Pillow-8.4.0:
Successfully uninstalled Pillow-8.4.0
Attempting uninstall: numpy
Found existing installation: numpy 1.19.1
Uninstalling numpy-1.19.1:
Successfully uninstalled numpy-1.19.1
Attempting uninstall: scipy
Found existing installation: scipy 1.4.1
Uninstalling scipy-1.4.1:
Successfully uninstalled scipy-1.4.1
Attempting uninstall: importlib-metadata
Found existing installation: importlib-metadata 4.8.2
Uninstalling importlib-metadata-4.8.2:
Successfully uninstalled importlib-metadata-4.8.2
Attempting uninstall: gluoncv
Found existing installation: gluoncv 0.8.0
Uninstalling gluoncv-0.8.0:
Successfully uninstalled gluoncv-0.8.0
Successfully installed Pillow-9.0.1 PyWavelets-1.3.0 absl-py-1.3.0 aiohttp-3.8.3 aiosignal-1.2.0 antlr4-python3-runtime-4.8 async-timeout-4.0.2 asynctest-0.13.0 autocfg-0.0.8 autogluon-0.5.2 autogluon-contrib-nlp-0.0.1b20220208 autogluon.common-0.5.2 autogluon.core-0.5.2 autogluon.features-0.5.2 autogluon.multimodal-0.5.2 autogluon.tabular-0.5.2 autogluon.text-0.5.2 autogluon.timeseries-0.5.2 autogluon.vision-0.5.2 blis-0.7.9 cachetools-5.2.0 catalogue-2.0.8 catboost-1.0.6 charset-normalizer-2.1.1 click-8.0.4 confection-0.0.3 contextvars-2.4 convertdate-2.4.0 cymem-2.0.7 dask-2021.11.2 deprecated-1.2.13 distlib-0.3.6 distributed-2021.11.2 fairscale-0.4.6 fastai-2.7.9 fastcore-1.5.27 fastdownload-0.0.7 fastprogress-1.0.3 filelock-3.8.0 flake8-5.0.4 frozenlist-1.3.1 future-0.18.2 gluoncv-0.10.5.post0 gluonts-0.9.9 google-auth-2.13.0 google-auth-oauthlib-0.4.6 grpcio-1.43.0 heapdict-1.0.1 hijri-converter-2.2.4 holidays-0.16 huggingface-hub-0.10.1 hyperopt-0.2.7 immutables-0.19 importlib-metadata-4.2.0 importlib-resources-5.10.0 jsonschema-4.16.0 korean-lunar-calendar-0.3.1 langcodes-3.3.0 lightgbm-3.3.3 locket-1.0.0 lxml-4.9.1 markdown-3.3.4 mccabe-0.7.0 msgpack-1.0.4 multidict-6.0.2 murmurhash-1.0.9 nlpaug-1.1.10 nltk-3.7 nptyping-1.4.4 numpy-1.21.6 oauthlib-3.2.2 omegaconf-2.1.2 partd-1.3.0 pathy-0.6.2 patsy-0.5.3 pkgutil-resolve-name-1.3.10 platformdirs-2.5.2 pmdarima-1.8.5 preshed-3.0.8 protobuf-3.18.1 py4j-0.10.9.7 pyDeprecate-0.3.2 pyasn1-modules-0.2.8 pycodestyle-2.9.1 pydantic-1.10.2 pyflakes-2.5.0 pymeeus-0.5.11 pyrsistent-0.18.1 pytorch-lightning-1.6.5 pytorch-metric-learning-1.3.2 ray-1.13.0 regex-2022.9.13 requests-oauthlib-1.3.1 sacrebleu-2.3.1 sacremoses-0.0.53 scikit-image-0.19.3 scipy-1.7.3 sentencepiece-0.1.95 sktime-0.11.4 smart-open-5.2.1 sortedcontainers-2.4.0 spacy-3.4.2 spacy-legacy-3.0.10 spacy-loggers-1.0.3 srsly-2.4.5 statsmodels-0.13.2 tbats-1.1.1 tblib-1.7.0 tensorboard-2.10.1 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 
tensorboardX-2.5.1 thinc-8.1.5 tifffile-2021.11.2 timm-0.5.4 tokenizers-0.12.1 toolz-0.12.0 torch-1.12.1 torchmetrics-0.7.3 torchtext-0.13.1 torchvision-0.13.1 tqdm-4.64.1 transformers-4.20.1 typer-0.4.2 typing-extensions-4.1.1 typish-1.9.3 virtualenv-20.16.2 wasabi-0.10.1 wrapt-1.14.1 xgboost-1.4.2 yacs-0.1.8 yarl-1.8.1 zict-2.2.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
# create the .kaggle directory and an empty kaggle.json file
!mkdir -p /root/.kaggle
!touch /root/.kaggle/kaggle.json
!chmod 600 /root/.kaggle/kaggle.json
# Fill in the username and key from your Kaggle account's API token file (kaggle.json)
import json
kaggle_username = "<your-kaggle-username>"  # do not commit real credentials to the notebook
kaggle_key = "<your-kaggle-api-key>"
# Save the API token to the kaggle.json file
with open("/root/.kaggle/kaggle.json", "w") as f:
f.write(json.dumps({"username": kaggle_username, "key": kaggle_key}))
!pip install kaggle
Collecting kaggle
Using cached kaggle-1.5.12-py3-none-any.whl
Requirement already satisfied: certifi in /usr/local/lib/python3.7/site-packages (from kaggle) (2021.10.8)
Requirement already satisfied: urllib3 in /usr/local/lib/python3.7/site-packages (from kaggle) (1.25.11)
Requirement already satisfied: python-dateutil in /usr/local/lib/python3.7/site-packages (from kaggle) (2.8.2)
Collecting python-slugify
Using cached python_slugify-6.1.2-py2.py3-none-any.whl (9.4 kB)
Requirement already satisfied: requests in /usr/local/lib/python3.7/site-packages (from kaggle) (2.22.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.7/site-packages (from kaggle) (4.64.1)
Requirement already satisfied: six>=1.10 in /usr/local/lib/python3.7/site-packages (from kaggle) (1.16.0)
Collecting text-unidecode>=1.3
Using cached text_unidecode-1.3-py2.py3-none-any.whl (78 kB)
Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.7/site-packages (from requests->kaggle) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.7/site-packages (from requests->kaggle) (3.0.4)
Installing collected packages: text-unidecode, python-slugify, kaggle
Successfully installed kaggle-1.5.12 python-slugify-6.1.2 text-unidecode-1.3
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
# Download the dataset; it will be in a .zip file, so you'll need to unzip it as well.
!kaggle competitions download -c bike-sharing-demand
# If you already downloaded it, you can use the -o option to overwrite the file
bike-sharing-demand.zip: Skipping, found more recently modified local copy (use --force to force download)
Archive: bike-sharing-demand.zip
replace sampleSubmission.csv? [y]es, [n]o, [A]ll, [N]one, [r]ename: ^C
/bin/sh: 1: y: not found
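The extraction above was aborted because `unzip` asked interactively whether to replace existing files. As a sketch of a non-interactive alternative, Python's standard `zipfile` module extracts and overwrites silently; the helper name `extract_archive` is hypothetical, and `bike-sharing-demand.zip` is the archive downloaded above.

```python
import zipfile

def extract_archive(zip_path: str, dest: str = ".") -> list:
    """Extract every member of zip_path into dest, overwriting existing files
    without prompting, and return the member names."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        return zf.namelist()

# In the notebook this would be:
# extract_archive("bike-sharing-demand.zip")
```

This avoids the `replace sampleSubmission.csv?` prompt entirely, so the cell can be re-run safely.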
import pandas as pd
from autogluon.tabular import TabularPredictor
import seaborn as sns
/usr/local/lib/python3.7/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
from .autonotebook import tqdm as notebook_tqdm
# Create the train dataset in pandas by reading the csv
# Set the parsing of the datetime column so you can use some of the `dt` features in pandas later
train = pd.read_csv("train.csv")
# Parsing the datetime column
train.loc[:, "datetime"] = pd.to_datetime(train.loc[:, "datetime"])
train.head()
| datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-01 00:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 81 | 0.0 | 3 | 13 | 16 |
| 1 | 2011-01-01 01:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 8 | 32 | 40 |
| 2 | 2011-01-01 02:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 5 | 27 | 32 |
| 3 | 2011-01-01 03:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 3 | 10 | 13 |
| 4 | 2011-01-01 04:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 0 | 1 | 1 |
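With `datetime` parsed, the pandas `dt` accessor mentioned above becomes available for feature engineering. A minimal sketch on a toy frame (the two timestamps and the `hour`/`dayofweek` column names are illustrative, not part of the dataset):

```python
import pandas as pd

# Toy frame mirroring the parsed `datetime` column of the train set
df = pd.DataFrame({"datetime": pd.to_datetime(["2011-01-01 00:00:00",
                                               "2011-01-01 13:00:00"])})
df["hour"] = df["datetime"].dt.hour            # 0-23, captures daily demand cycles
df["dayofweek"] = df["datetime"].dt.dayofweek  # Monday=0 ... Sunday=6
```

The same pattern applied to `train` and `test` would give AutoGluon explicit time-of-day features rather than a single opaque timestamp.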
# Simple output of the train dataset to view the min/max/variation of the dataset features.
train.describe()
| season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.00000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 |
| mean | 2.506614 | 0.028569 | 0.680875 | 1.418427 | 20.23086 | 23.655084 | 61.886460 | 12.799395 | 36.021955 | 155.552177 | 191.574132 |
| std | 1.116174 | 0.166599 | 0.466159 | 0.633839 | 7.79159 | 8.474601 | 19.245033 | 8.164537 | 49.960477 | 151.039033 | 181.144454 |
| min | 1.000000 | 0.000000 | 0.000000 | 1.000000 | 0.82000 | 0.760000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 |
| 25% | 2.000000 | 0.000000 | 0.000000 | 1.000000 | 13.94000 | 16.665000 | 47.000000 | 7.001500 | 4.000000 | 36.000000 | 42.000000 |
| 50% | 3.000000 | 0.000000 | 1.000000 | 1.000000 | 20.50000 | 24.240000 | 62.000000 | 12.998000 | 17.000000 | 118.000000 | 145.000000 |
| 75% | 4.000000 | 0.000000 | 1.000000 | 2.000000 | 26.24000 | 31.060000 | 77.000000 | 16.997900 | 49.000000 | 222.000000 | 284.000000 |
| max | 4.000000 | 1.000000 | 1.000000 | 4.000000 | 41.00000 | 45.455000 | 100.000000 | 56.996900 | 367.000000 | 886.000000 | 977.000000 |
We can notice that:

- There are no missing values: every column has all 10886 entries.
- `count` ranges from 1 to 977 and is right-skewed (mean ≈ 191.6 vs. median 145).
- `casual` and `registered` sum to `count`, so they directly encode the target.
# Printing informations about the dataset
train.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10886 entries, 0 to 10885
Data columns (total 12 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   datetime    10886 non-null  datetime64[ns]
 1   season      10886 non-null  int64
 2   holiday     10886 non-null  int64
 3   workingday  10886 non-null  int64
 4   weather     10886 non-null  int64
 5   temp        10886 non-null  float64
 6   atemp       10886 non-null  float64
 7   humidity    10886 non-null  int64
 8   windspeed   10886 non-null  float64
 9   casual      10886 non-null  int64
 10  registered  10886 non-null  int64
 11  count       10886 non-null  int64
dtypes: datetime64[ns](1), float64(3), int64(8)
memory usage: 1020.7 KB
# Create the test dataframe by reading the csv, remember to parse the datetime!
test = pd.read_csv("test.csv")
# Parsing the datetime column
test.loc[:, "datetime"] = pd.to_datetime(test.loc[:, "datetime"])
test.head()
| datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-20 00:00:00 | 1 | 0 | 1 | 1 | 10.66 | 11.365 | 56 | 26.0027 |
| 1 | 2011-01-20 01:00:00 | 1 | 0 | 1 | 1 | 10.66 | 13.635 | 56 | 0.0000 |
| 2 | 2011-01-20 02:00:00 | 1 | 0 | 1 | 1 | 10.66 | 13.635 | 56 | 0.0000 |
| 3 | 2011-01-20 03:00:00 | 1 | 0 | 1 | 1 | 10.66 | 12.880 | 56 | 11.0014 |
| 4 | 2011-01-20 04:00:00 | 1 | 0 | 1 | 1 | 10.66 | 12.880 | 56 | 11.0014 |
# Same as the train and test datasets: read the sample submission csv
submission = pd.read_csv("sampleSubmission.csv")
submission.head()
| datetime | count | |
|---|---|---|
| 0 | 2011-01-20 00:00:00 | 0 |
| 1 | 2011-01-20 01:00:00 | 0 |
| 2 | 2011-01-20 02:00:00 | 0 |
| 3 | 2011-01-20 03:00:00 | 0 |
| 4 | 2011-01-20 04:00:00 | 0 |
Requirements:

- We are predicting `count`, so it is the label we are setting.
- Ignore the `casual` and `registered` columns as they are also not present in the test dataset.
- Use `root_mean_squared_error` as the metric to use for evaluation.
- Use the `best_quality` preset to focus on creating the best model.

predictor = TabularPredictor(label="count", problem_type="regression", eval_metric="root_mean_squared_error").fit(
    train_data=train.loc[:, train.columns.difference(["casual", "registered"])], time_limit=600, presets="best_quality"
)
No path specified. Models will be saved in: "AutogluonModels/ag-20221019_194002/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221019_194002/"
AutoGluon Version: 0.5.2
Python Version: 3.7.10
Operating System: Linux
Train Data Rows: 10886
Train Data Columns: 9
Label Column: count
Preprocessing data ...
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 2049.17 MB
Train Data (Original) Memory Usage: 0.78 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 5 | ['holiday', 'humidity', 'season', 'weather', 'workingday']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 3 | ['humidity', 'season', 'weather']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.5s = Fit runtime
9 features in original data used to generate 13 features in processed data.
Train Data (Processed) Memory Usage: 0.98 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.57s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.52s of the 599.41s of remaining time.
-101.5462 = Validation score (-root_mean_squared_error)
0.05s = Training runtime
0.1s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 399.1s of the 598.99s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.05s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 398.7s of the 598.59s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-131.4127 = Validation score (-root_mean_squared_error)
73.17s = Training runtime
8.61s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 318.82s of the 518.71s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-131.0484 = Validation score (-root_mean_squared_error)
27.7s = Training runtime
1.28s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 287.48s of the 487.37s of remaining time.
-116.6324 = Validation score (-root_mean_squared_error)
11.37s = Training runtime
0.53s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 272.83s of the 472.72s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-130.6008 = Validation score (-root_mean_squared_error)
194.46s = Training runtime
0.13s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 75.19s of the 275.08s of remaining time.
-124.4967 = Validation score (-root_mean_squared_error)
4.84s = Training runtime
0.52s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 67.18s of the 267.07s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-137.175 = Validation score (-root_mean_squared_error)
74.35s = Training runtime
0.42s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 189.46s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.69s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 188.69s of the 188.67s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-60.5701 = Validation score (-root_mean_squared_error)
48.96s = Training runtime
2.9s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 136.19s of the 136.18s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-55.1061 = Validation score (-root_mean_squared_error)
24.2s = Training runtime
0.23s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 109.0s of the 108.98s of remaining time.
-53.2786 = Validation score (-root_mean_squared_error)
26.37s = Training runtime
0.59s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 79.6s of the 79.58s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-55.633 = Validation score (-root_mean_squared_error)
71.56s = Training runtime
0.09s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L2 ... Training model for up to 5.03s of the 5.01s of remaining time.
-53.776 = Validation score (-root_mean_squared_error)
9.1s = Training runtime
0.59s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -7.32s of remaining time.
-52.7555 = Validation score (-root_mean_squared_error)
0.51s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 608.04s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221019_194002/")
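Before submitting, note that a regressor trained on RMSE can emit negative values, while ride counts cannot be negative (and the Kaggle scorer may reject them). A minimal sketch of flooring predictions at zero; the three values below are hypothetical stand-ins for what `predictor.predict(test)` would return:

```python
import pandas as pd

# Hypothetical raw outputs standing in for predictor.predict(test)
predictions = pd.Series([12.3, -4.1, 250.0])

# Ride counts cannot be negative, so clip the predictions at zero
predictions = predictions.clip(lower=0)
```

The clipped series can then be assigned to `submission["count"]` and written out with `submission.to_csv(..., index=False)`.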
predictor.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -52.729069 14.257364 445.129123 0.001256 0.433843 3 True 15
1 RandomForestMSE_BAG_L2 -53.324848 13.405554 412.855981 0.590444 25.980774 2 True 12
2 ExtraTreesMSE_BAG_L2 -53.717484 13.400470 395.048464 0.585360 8.173257 2 True 14
3 LightGBM_BAG_L2 -54.975211 13.080305 410.541249 0.265195 23.666042 2 True 11
4 CatBoost_BAG_L2 -55.578418 12.894655 450.923761 0.079545 64.048554 2 True 13
5 LightGBMXT_BAG_L2 -60.497960 16.269038 438.927994 3.453928 52.052787 2 True 10
6 KNeighborsDist_BAG_L1 -84.125061 0.103696 0.033458 0.103696 0.033458 1 True 2
7 WeightedEnsemble_L2 -84.125061 0.104901 0.601218 0.001205 0.567760 2 True 9
8 KNeighborsUnif_BAG_L1 -101.546199 0.103069 0.114331 0.103069 0.114331 1 True 1
9 RandomForestMSE_BAG_L1 -116.632421 0.537841 10.762428 0.537841 10.762428 1 True 5
10 ExtraTreesMSE_BAG_L1 -124.496689 0.516548 4.834809 0.516548 4.834809 1 True 7
11 CatBoost_BAG_L1 -130.600759 0.103025 192.814319 0.103025 192.814319 1 True 6
12 LightGBM_BAG_L1 -131.048402 1.525897 29.324282 1.525897 29.324282 1 True 4
13 LightGBMXT_BAG_L1 -131.412741 9.414168 76.680547 9.414168 76.680547 1 True 3
14 NeuralNetFastAI_BAG_L1 -137.554677 0.510866 72.311033 0.510866 72.311033 1 True 8
Number of models trained: 15
Types of models trained:
{'StackerEnsembleModel_XT', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_NNFastAiTabular', 'WeightedEnsembleModel', 'StackerEnsembleModel_KNN', 'StackerEnsembleModel_CatBoost'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 3 | ['humidity', 'season', 'weather']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221019_181447/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'NeuralNetFastAI_BAG_L1': 'StackerEnsembleModel_NNFastAiTabular',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L2': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -131.41274077052907,
'LightGBM_BAG_L1': -131.04840164127194,
'RandomForestMSE_BAG_L1': -116.63242058947374,
'CatBoost_BAG_L1': -130.6007588943428,
'ExtraTreesMSE_BAG_L1': -124.49668948784444,
'NeuralNetFastAI_BAG_L1': -137.55467740409978,
'WeightedEnsemble_L2': -84.12506123181602,
'LightGBMXT_BAG_L2': -60.4979596005761,
'LightGBM_BAG_L2': -54.97521083134642,
'RandomForestMSE_BAG_L2': -53.32484832178292,
'CatBoost_BAG_L2': -55.5784178479148,
'ExtraTreesMSE_BAG_L2': -53.717483530403854,
'WeightedEnsemble_L3': -52.729068758216826},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20221019_181447/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20221019_181447/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20221019_181447/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20221019_181447/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20221019_181447/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20221019_181447/models/CatBoost_BAG_L1/',
'ExtraTreesMSE_BAG_L1': 'AutogluonModels/ag-20221019_181447/models/ExtraTreesMSE_BAG_L1/',
'NeuralNetFastAI_BAG_L1': 'AutogluonModels/ag-20221019_181447/models/NeuralNetFastAI_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221019_181447/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20221019_181447/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20221019_181447/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20221019_181447/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20221019_181447/models/CatBoost_BAG_L2/',
'ExtraTreesMSE_BAG_L2': 'AutogluonModels/ag-20221019_181447/models/ExtraTreesMSE_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221019_181447/models/WeightedEnsemble_L3/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.11433124542236328,
'KNeighborsDist_BAG_L1': 0.03345799446105957,
'LightGBMXT_BAG_L1': 76.68054699897766,
'LightGBM_BAG_L1': 29.32428216934204,
'RandomForestMSE_BAG_L1': 10.762428045272827,
'CatBoost_BAG_L1': 192.8143186569214,
'ExtraTreesMSE_BAG_L1': 4.834809064865112,
'NeuralNetFastAI_BAG_L1': 72.31103277206421,
'WeightedEnsemble_L2': 0.5677599906921387,
'LightGBMXT_BAG_L2': 52.05278706550598,
'LightGBM_BAG_L2': 23.66604208946228,
'RandomForestMSE_BAG_L2': 25.98077416419983,
'CatBoost_BAG_L2': 64.04855418205261,
'ExtraTreesMSE_BAG_L2': 8.17325735092163,
'WeightedEnsemble_L3': 0.43384289741516113},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.10306930541992188,
'KNeighborsDist_BAG_L1': 0.10369586944580078,
'LightGBMXT_BAG_L1': 9.414168119430542,
'LightGBM_BAG_L1': 1.5258972644805908,
'RandomForestMSE_BAG_L1': 0.5378408432006836,
'CatBoost_BAG_L1': 0.10302495956420898,
'ExtraTreesMSE_BAG_L1': 0.5165479183197021,
'NeuralNetFastAI_BAG_L1': 0.5108659267425537,
'WeightedEnsemble_L2': 0.0012049674987792969,
'LightGBMXT_BAG_L2': 3.453927516937256,
'LightGBM_BAG_L2': 0.2651948928833008,
'RandomForestMSE_BAG_L2': 0.5904438495635986,
'CatBoost_BAG_L2': 0.0795445442199707,
'ExtraTreesMSE_BAG_L2': 0.5853595733642578,
'WeightedEnsemble_L3': 0.0012555122375488281},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'NeuralNetFastAI_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -52.729069 14.257364 445.129123
1 RandomForestMSE_BAG_L2 -53.324848 13.405554 412.855981
2 ExtraTreesMSE_BAG_L2 -53.717484 13.400470 395.048464
3 LightGBM_BAG_L2 -54.975211 13.080305 410.541249
4 CatBoost_BAG_L2 -55.578418 12.894655 450.923761
5 LightGBMXT_BAG_L2 -60.497960 16.269038 438.927994
6 KNeighborsDist_BAG_L1 -84.125061 0.103696 0.033458
7 WeightedEnsemble_L2 -84.125061 0.104901 0.601218
8 KNeighborsUnif_BAG_L1 -101.546199 0.103069 0.114331
9 RandomForestMSE_BAG_L1 -116.632421 0.537841 10.762428
10 ExtraTreesMSE_BAG_L1 -124.496689 0.516548 4.834809
11 CatBoost_BAG_L1 -130.600759 0.103025 192.814319
12 LightGBM_BAG_L1 -131.048402 1.525897 29.324282
13 LightGBMXT_BAG_L1 -131.412741 9.414168 76.680547
14 NeuralNetFastAI_BAG_L1 -137.554677 0.510866 72.311033
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.001256 0.433843 3 True
1 0.590444 25.980774 2 True
2 0.585360 8.173257 2 True
3 0.265195 23.666042 2 True
4 0.079545 64.048554 2 True
5 3.453928 52.052787 2 True
6 0.103696 0.033458 1 True
7 0.001205 0.567760 2 True
8 0.103069 0.114331 1 True
9 0.537841 10.762428 1 True
10 0.516548 4.834809 1 True
11 0.103025 192.814319 1 True
12 1.525897 29.324282 1 True
13 9.414168 76.680547 1 True
14 0.510866 72.311033 1 True
fit_order
0 15
1 12
2 14
3 11
4 13
5 10
6 2
7 9
8 1
9 5
10 7
11 6
12 4
13 3
14 8 }
Let's plot the validation scores of the trained models to compare the top performers.
predictor.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val")
<AxesSubplot:xlabel='model'>
test.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6493 entries, 0 to 6492
Data columns (total 9 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   datetime    6493 non-null   datetime64[ns]
 1   season      6493 non-null   int64
 2   holiday     6493 non-null   int64
 3   workingday  6493 non-null   int64
 4   weather     6493 non-null   int64
 5   temp        6493 non-null   float64
 6   atemp       6493 non-null   float64
 7   humidity    6493 non-null   int64
 8   windspeed   6493 non-null   float64
dtypes: datetime64[ns](1), float64(3), int64(5)
memory usage: 456.7 KB
predictions = predictor.predict(test)
# Describe the `predictions` series to see if there are any negative values
print("Predictions: \n", predictions)
Predictions:
0 24.271076
1 40.670219
2 44.759418
3 48.270988
4 51.002129
...
6488 158.364975
6489 158.364975
6490 154.558319
6491 147.399673
6492 153.469833
Name: count, Length: 6493, dtype: float32
predictions.describe()
count    6493.000000
mean      100.682121
std        90.388634
min         3.041300
25%        20.355358
50%        62.666965
75%       170.241760
max       363.189880
Name: count, dtype: float64
# How many negative values do we have?
print("Negative predictions are : \n", predictions[predictions<0])
Negative predictions are : Series([], Name: count, dtype: float32)
We have no negative values.
# Clip any negative predictions to zero anyway, as a safeguard
predictions[predictions<0] = 0
submission["count"] = predictions
submission.to_csv("submission.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission.csv -m "3nd raw submission"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 365kB/s] Successfully submitted to Bike Sharing Demand
My Submissions
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                     date                 description           status    publicScore  privateScore
---------------------------  -------------------  --------------------  --------  -----------  ------------
submission.csv               2022-10-19 18:30:41  3nd raw submission    complete  1.80895      1.80895
submission_new_features.csv  2022-10-16 22:46:40  new features          complete  1.80152      1.80152
submission.csv               2022-10-16 20:05:41  2nd raw submission    complete  1.80406      1.80406
submission.csv               2022-10-16 20:05:24  first raw submission  complete  1.80406      1.80406
1.80406
# Create a histogram of all features to show the distribution of each one relative to the data. This is part of the exploratory data analysis
import matplotlib.pyplot as plt
fig = plt.figure(figsize = (15,20))
ax = fig.gca()
train.hist(ax = ax)
/usr/local/lib/python3.7/site-packages/ipykernel_launcher.py:5: UserWarning: To output multiple subplots, the figure containing the passed axes is being cleared
  """
array([[<AxesSubplot:title={'center':'datetime'}>,
<AxesSubplot:title={'center':'season'}>,
<AxesSubplot:title={'center':'holiday'}>],
[<AxesSubplot:title={'center':'workingday'}>,
<AxesSubplot:title={'center':'weather'}>,
<AxesSubplot:title={'center':'temp'}>],
[<AxesSubplot:title={'center':'atemp'}>,
<AxesSubplot:title={'center':'humidity'}>,
<AxesSubplot:title={'center':'windspeed'}>],
[<AxesSubplot:title={'center':'casual'}>,
<AxesSubplot:title={'center':'registered'}>,
<AxesSubplot:title={'center':'count'}>]], dtype=object)
train.corr()
| season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| season | 1.000000 | 0.029368 | -0.008126 | 0.008879 | 0.258689 | 0.264744 | 0.190610 | -0.147121 | 0.096758 | 0.164011 | 0.163439 |
| holiday | 0.029368 | 1.000000 | -0.250491 | -0.007074 | 0.000295 | -0.005215 | 0.001929 | 0.008409 | 0.043799 | -0.020956 | -0.005393 |
| workingday | -0.008126 | -0.250491 | 1.000000 | 0.033772 | 0.029966 | 0.024660 | -0.010880 | 0.013373 | -0.319111 | 0.119460 | 0.011594 |
| weather | 0.008879 | -0.007074 | 0.033772 | 1.000000 | -0.055035 | -0.055376 | 0.406244 | 0.007261 | -0.135918 | -0.109340 | -0.128655 |
| temp | 0.258689 | 0.000295 | 0.029966 | -0.055035 | 1.000000 | 0.984948 | -0.064949 | -0.017852 | 0.467097 | 0.318571 | 0.394454 |
| atemp | 0.264744 | -0.005215 | 0.024660 | -0.055376 | 0.984948 | 1.000000 | -0.043536 | -0.057473 | 0.462067 | 0.314635 | 0.389784 |
| humidity | 0.190610 | 0.001929 | -0.010880 | 0.406244 | -0.064949 | -0.043536 | 1.000000 | -0.318607 | -0.348187 | -0.265458 | -0.317371 |
| windspeed | -0.147121 | 0.008409 | 0.013373 | 0.007261 | -0.017852 | -0.057473 | -0.318607 | 1.000000 | 0.092276 | 0.091052 | 0.101369 |
| casual | 0.096758 | 0.043799 | -0.319111 | -0.135918 | 0.467097 | 0.462067 | -0.348187 | 0.092276 | 1.000000 | 0.497250 | 0.690414 |
| registered | 0.164011 | -0.020956 | 0.119460 | -0.109340 | 0.318571 | 0.314635 | -0.265458 | 0.091052 | 0.497250 | 1.000000 | 0.970948 |
| count | 0.163439 | -0.005393 | 0.011594 | -0.128655 | 0.394454 | 0.389784 | -0.317371 | 0.101369 | 0.690414 | 0.970948 | 1.000000 |
sns.clustermap(train.corr())
<seaborn.matrix.ClusterGrid at 0x7f9d6d0a4110>
The correlation matrix shows that temp and atemp are highly correlated (about 0.98). This makes sense, since atemp is the "feels like" temperature derived from temp, so we may want to keep only one of them: keeping both would duplicate information. The other correlations are not as high.
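As a minimal sketch of what dropping the redundant feature could look like (using a hypothetical mini-frame standing in for `train`; whether to drop `temp` or `atemp` is a modeling choice):

```python
import pandas as pd

# Tiny stand-in for the training frame: temp and atemp carry
# nearly the same information, so we keep only one of them.
df = pd.DataFrame({
    "temp": [9.84, 9.02, 9.84],
    "atemp": [14.395, 13.635, 14.395],
    "count": [16, 40, 13],
})
reduced = df.drop(columns=["atemp"])
print(reduced.columns.tolist())  # ['temp', 'count']
```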
sns.pairplot(train)
<seaborn.axisgrid.PairGrid at 0x7f9d5f56af90>
We can notice a few things from the pairplot:
import matplotlib.pyplot as plt
fig = plt.figure(figsize = (15,20))
ax = fig.gca()
train.sort_values('datetime').plot.line('datetime','count',ax = ax)
<AxesSubplot:xlabel='datetime'>
The plot of the rental-count time series shows that the number of rentals increased from 2011 to 2012. We can also see that, within the same season, the number of rentals varies by month. It would be interesting to add time features extracted from the datetime column to gain more insight into rental-count variation at different time granularities.
# Create new features
# Separate datetime into year, month, day and hour features
train_new = train  # note: this aliases train; use train.copy() to avoid mutating the original
train_new["year"] = train_new.datetime.dt.year
train_new["month"] = train_new.datetime.dt.month
train_new["day"] = train_new.datetime.dt.day
train_new["hour"] = train_new.datetime.dt.hour
test_new = test  # note: this aliases test; use test.copy() to avoid mutating the original
test_new["year"] = test_new.datetime.dt.year
test_new["month"] = test_new.datetime.dt.month
test_new["day"] = test_new.datetime.dt.day
test_new["hour"] = test_new.datetime.dt.hour
import matplotlib.pyplot as plt
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))
train_new.plot(ax=axes[0, 0], x="year", y="count", kind="scatter")
train_new.plot(ax=axes[0, 1], x="month", y="count", kind="scatter")
train_new.plot(ax=axes[1, 0], x="day", y="count", kind="scatter")
train_new.plot(ax=axes[1, 1], x="hour", y="count", kind="scatter")
<AxesSubplot:xlabel='hour', ylabel='count'>
We can notice that more bikes are rented at 8 am and 5 pm, which correspond to the start and end of the typical workday.
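One way to make these commute peaks explicit is to average rentals per hour; a minimal sketch, using a hypothetical mini-frame standing in for `train_new`:

```python
import pandas as pd

# Tiny stand-in for train_new: two observations each at 3 am,
# 8 am and 5 pm; the per-hour mean surfaces the busiest hour.
df = pd.DataFrame({
    "hour": [8, 8, 17, 17, 3, 3],
    "count": [300, 280, 350, 330, 10, 14],
})
hourly_mean = df.groupby("hour")["count"].mean()
print(hourly_mean.idxmax())  # hour with the highest average rentals
```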
# Create a histogram of all features to show the distribution of each one relative to the data. This is part of the exploratory data analysis
import matplotlib.pyplot as plt
fig = plt.figure(figsize = (15,20))
ax = fig.gca()
train_new.hist(ax = ax)
/usr/local/lib/python3.7/site-packages/ipykernel_launcher.py:5: UserWarning: To output multiple subplots, the figure containing the passed axes is being cleared
  """
array([[<AxesSubplot:title={'center':'datetime'}>,
<AxesSubplot:title={'center':'season'}>,
<AxesSubplot:title={'center':'holiday'}>,
<AxesSubplot:title={'center':'workingday'}>],
[<AxesSubplot:title={'center':'weather'}>,
<AxesSubplot:title={'center':'temp'}>,
<AxesSubplot:title={'center':'atemp'}>,
<AxesSubplot:title={'center':'humidity'}>],
[<AxesSubplot:title={'center':'windspeed'}>,
<AxesSubplot:title={'center':'casual'}>,
<AxesSubplot:title={'center':'registered'}>,
<AxesSubplot:title={'center':'count'}>],
[<AxesSubplot:title={'center':'year'}>,
<AxesSubplot:title={'center':'month'}>,
<AxesSubplot:title={'center':'day'}>,
<AxesSubplot:title={'center':'hour'}>]], dtype=object)
train_new["season"] = train_new.season.astype('category')
train_new["weather"] = train_new.weather.astype('category')
train_new["holiday"] = train_new.holiday.astype('category')
train_new["workingday"] = train_new.workingday.astype('category')
test_new["season"] = test_new.season.astype('category')
test_new["weather"] = test_new.weather.astype('category')
test_new["holiday"] = test_new.holiday.astype('category')
test_new["workingday"] = test_new.workingday.astype('category')
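Marking these columns as `category` tells AutoGluon to treat them as discrete classes rather than as ordered numbers. A quick sanity check of the conversion, on a hypothetical mini-frame:

```python
import pandas as pd

# Tiny stand-in for the frames above: convert two coded columns
# and confirm pandas now reports them as categorical.
df = pd.DataFrame({"season": [1, 2, 3, 4], "weather": [1, 1, 2, 3]})
for col in ["season", "weather"]:
    df[col] = df[col].astype("category")
print(df.dtypes.astype(str).tolist())  # ['category', 'category']
```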
# View our new features
train_new.head()
| datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count | year | month | day | hour | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-01 00:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 81 | 0.0 | 3 | 13 | 16 | 2011 | 1 | 1 | 0 |
| 1 | 2011-01-01 01:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 8 | 32 | 40 | 2011 | 1 | 1 | 1 |
| 2 | 2011-01-01 02:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 5 | 27 | 32 | 2011 | 1 | 1 | 2 |
| 3 | 2011-01-01 03:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 3 | 10 | 13 | 2011 | 1 | 1 | 3 |
| 4 | 2011-01-01 04:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 0 | 1 | 1 | 2011 | 1 | 1 | 4 |
# View the histogram of all features again, now with the new time features
import matplotlib.pyplot as plt
fig = plt.figure(figsize = (15,20))
ax = fig.gca()
train_new.hist(ax = ax)
/usr/local/lib/python3.7/site-packages/ipykernel_launcher.py:5: UserWarning: To output multiple subplots, the figure containing the passed axes is being cleared
  """
array([[<AxesSubplot:title={'center':'datetime'}>,
<AxesSubplot:title={'center':'temp'}>,
<AxesSubplot:title={'center':'atemp'}>],
[<AxesSubplot:title={'center':'humidity'}>,
<AxesSubplot:title={'center':'windspeed'}>,
<AxesSubplot:title={'center':'casual'}>],
[<AxesSubplot:title={'center':'registered'}>,
<AxesSubplot:title={'center':'count'}>,
<AxesSubplot:title={'center':'year'}>],
[<AxesSubplot:title={'center':'month'}>,
<AxesSubplot:title={'center':'day'}>,
<AxesSubplot:title={'center':'hour'}>]], dtype=object)
predictor_new_features = TabularPredictor(label="count", problem_type="regression", eval_metric="root_mean_squared_error").fit(
train_data=train_new.loc[:, train_new.columns.difference(["casual","registered"])], time_limit=600, presets="best_quality"
)
No path specified. Models will be saved in: "AutogluonModels/ag-20221023_205528/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221023_205528/"
AutoGluon Version: 0.5.2
Python Version: 3.7.10
Operating System: Linux
Train Data Rows: 10886
Train Data Columns: 13
Label Column: count
Preprocessing data ...
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 1982.02 MB
Train Data (Original) Memory Usage: 1.13 MB (0.1% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 3 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 9 | ['day', 'holiday', 'hour', 'humidity', 'month', ...]
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 6 | ['day', 'hour', 'humidity', 'month', 'season', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.1s = Fit runtime
13 features in original data used to generate 17 features in processed data.
Train Data (Processed) Memory Usage: 1.25 MB (0.1% of available memory)
Data preprocessing and feature engineering runtime = 0.19s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.77s of the 599.8s of remaining time.
-101.5462 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.1s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 399.39s of the 599.42s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 399.01s of the 599.04s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-34.4947 = Validation score (-root_mean_squared_error)
91.94s = Training runtime
10.76s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 301.77s of the 501.8s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.9992 = Validation score (-root_mean_squared_error)
45.93s = Training runtime
3.3s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 252.02s of the 452.05s of remaining time.
-38.3986 = Validation score (-root_mean_squared_error)
14.11s = Training runtime
0.58s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 234.83s of the 434.86s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.2593 = Validation score (-root_mean_squared_error)
197.88s = Training runtime
0.15s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 33.93s of the 233.95s of remaining time.
-38.4819 = Validation score (-root_mean_squared_error)
6.2s = Training runtime
0.56s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 24.63s of the 224.66s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-102.6377 = Validation score (-root_mean_squared_error)
44.0s = Training runtime
0.45s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 176.55s of remaining time.
-31.9641 = Validation score (-root_mean_squared_error)
0.51s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 175.97s of the 175.95s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-31.054 = Validation score (-root_mean_squared_error)
27.65s = Training runtime
0.59s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 144.86s of the 144.84s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.748 = Validation score (-root_mean_squared_error)
24.02s = Training runtime
0.22s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 117.68s of the 117.66s of remaining time.
-31.7362 = Validation score (-root_mean_squared_error)
31.06s = Training runtime
0.75s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 83.49s of the 83.47s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.4666 = Validation score (-root_mean_squared_error)
67.4s = Training runtime
0.1s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L2 ... Training model for up to 13.11s of the 13.09s of remaining time.
-31.4382 = Validation score (-root_mean_squared_error)
9.6s = Training runtime
0.62s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L2 ... Training model for up to 0.39s of the 0.37s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
Time limit exceeded... Skipping NeuralNetFastAI_BAG_L2.
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -8.48s of remaining time.
-30.2526 = Validation score (-root_mean_squared_error)
0.32s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 609.0s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221023_205528/")
predictor_new_features.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -30.252623 17.556348 529.136375 0.000796 0.323466 3 True 15
1 CatBoost_BAG_L2 -30.466598 16.121886 467.535128 0.095946 67.399496 2 True 13
2 LightGBM_BAG_L2 -30.747974 16.246617 424.158355 0.220676 24.022724 2 True 11
3 LightGBMXT_BAG_L2 -31.053955 16.615151 427.785796 0.589211 27.650164 2 True 10
4 ExtraTreesMSE_BAG_L2 -31.438204 16.649720 409.740524 0.623779 9.604893 2 True 14
5 RandomForestMSE_BAG_L2 -31.736159 16.777495 431.193593 0.751554 31.057962 2 True 12
6 WeightedEnsemble_L2 -31.964149 14.909500 350.399838 0.000958 0.506435 2 True 9
7 CatBoost_BAG_L1 -33.259343 0.154472 197.879862 0.154472 197.879862 1 True 6
8 LightGBM_BAG_L1 -33.999247 3.303141 45.928752 3.303141 45.928752 1 True 4
9 LightGBMXT_BAG_L1 -34.494653 10.764218 91.938161 10.764218 91.938161 1 True 3
10 RandomForestMSE_BAG_L1 -38.398605 0.583024 14.109219 0.583024 14.109219 1 True 5
11 ExtraTreesMSE_BAG_L1 -38.481929 0.564917 6.199713 0.564917 6.199713 1 True 7
12 KNeighborsDist_BAG_L1 -84.125061 0.103687 0.037409 0.103687 0.037409 1 True 2
13 KNeighborsUnif_BAG_L1 -101.546199 0.102817 0.042574 0.102817 0.042574 1 True 1
14 NeuralNetFastAI_BAG_L1 -102.637717 0.449665 43.999941 0.449665 43.999941 1 True 8
Number of models trained: 15
Types of models trained:
{'StackerEnsembleModel_KNN', 'StackerEnsembleModel_NNFastAiTabular', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_CatBoost', 'StackerEnsembleModel_RF', 'WeightedEnsembleModel', 'StackerEnsembleModel_LGB'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 6 | ['day', 'hour', 'humidity', 'month', 'season', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221023_205528/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'NeuralNetFastAI_BAG_L1': 'StackerEnsembleModel_NNFastAiTabular',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L2': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -34.494653086618804,
'LightGBM_BAG_L1': -33.99924669761093,
'RandomForestMSE_BAG_L1': -38.39860545954656,
'CatBoost_BAG_L1': -33.25934270773738,
'ExtraTreesMSE_BAG_L1': -38.481929349341094,
'NeuralNetFastAI_BAG_L1': -102.63771662405847,
'WeightedEnsemble_L2': -31.964149052890754,
'LightGBMXT_BAG_L2': -31.0539547053999,
'LightGBM_BAG_L2': -30.747973719282207,
'RandomForestMSE_BAG_L2': -31.736159000059434,
'CatBoost_BAG_L2': -30.466598375481624,
'ExtraTreesMSE_BAG_L2': -31.438203849948902,
'WeightedEnsemble_L3': -30.25262269369684},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20221023_205528/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20221023_205528/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20221023_205528/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20221023_205528/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20221023_205528/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20221023_205528/models/CatBoost_BAG_L1/',
'ExtraTreesMSE_BAG_L1': 'AutogluonModels/ag-20221023_205528/models/ExtraTreesMSE_BAG_L1/',
'NeuralNetFastAI_BAG_L1': 'AutogluonModels/ag-20221023_205528/models/NeuralNetFastAI_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221023_205528/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20221023_205528/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20221023_205528/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20221023_205528/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20221023_205528/models/CatBoost_BAG_L2/',
'ExtraTreesMSE_BAG_L2': 'AutogluonModels/ag-20221023_205528/models/ExtraTreesMSE_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221023_205528/models/WeightedEnsemble_L3/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.04257369041442871,
'KNeighborsDist_BAG_L1': 0.03740859031677246,
'LightGBMXT_BAG_L1': 91.93816137313843,
'LightGBM_BAG_L1': 45.928752183914185,
'RandomForestMSE_BAG_L1': 14.10921859741211,
'CatBoost_BAG_L1': 197.8798623085022,
'ExtraTreesMSE_BAG_L1': 6.199713468551636,
'NeuralNetFastAI_BAG_L1': 43.99994111061096,
'WeightedEnsemble_L2': 0.5064353942871094,
'LightGBMXT_BAG_L2': 27.650164365768433,
'LightGBM_BAG_L2': 24.02272391319275,
'RandomForestMSE_BAG_L2': 31.05796194076538,
'CatBoost_BAG_L2': 67.39949631690979,
'ExtraTreesMSE_BAG_L2': 9.60489273071289,
'WeightedEnsemble_L3': 0.32346630096435547},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.10281658172607422,
'KNeighborsDist_BAG_L1': 0.10368657112121582,
'LightGBMXT_BAG_L1': 10.764217853546143,
'LightGBM_BAG_L1': 3.3031413555145264,
'RandomForestMSE_BAG_L1': 0.583024263381958,
'CatBoost_BAG_L1': 0.15447187423706055,
'ExtraTreesMSE_BAG_L1': 0.5649173259735107,
'NeuralNetFastAI_BAG_L1': 0.449664831161499,
'WeightedEnsemble_L2': 0.0009582042694091797,
'LightGBMXT_BAG_L2': 0.5892107486724854,
'LightGBM_BAG_L2': 0.22067594528198242,
'RandomForestMSE_BAG_L2': 0.7515542507171631,
'CatBoost_BAG_L2': 0.09594583511352539,
'ExtraTreesMSE_BAG_L2': 0.6237790584564209,
'WeightedEnsemble_L3': 0.0007958412170410156},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'NeuralNetFastAI_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -30.252623 17.556348 529.136375
1 CatBoost_BAG_L2 -30.466598 16.121886 467.535128
2 LightGBM_BAG_L2 -30.747974 16.246617 424.158355
3 LightGBMXT_BAG_L2 -31.053955 16.615151 427.785796
4 ExtraTreesMSE_BAG_L2 -31.438204 16.649720 409.740524
5 RandomForestMSE_BAG_L2 -31.736159 16.777495 431.193593
6 WeightedEnsemble_L2 -31.964149 14.909500 350.399838
7 CatBoost_BAG_L1 -33.259343 0.154472 197.879862
8 LightGBM_BAG_L1 -33.999247 3.303141 45.928752
9 LightGBMXT_BAG_L1 -34.494653 10.764218 91.938161
10 RandomForestMSE_BAG_L1 -38.398605 0.583024 14.109219
11 ExtraTreesMSE_BAG_L1 -38.481929 0.564917 6.199713
12 KNeighborsDist_BAG_L1 -84.125061 0.103687 0.037409
13 KNeighborsUnif_BAG_L1 -101.546199 0.102817 0.042574
14 NeuralNetFastAI_BAG_L1 -102.637717 0.449665 43.999941
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.000796 0.323466 3 True
1 0.095946 67.399496 2 True
2 0.220676 24.022724 2 True
3 0.589211 27.650164 2 True
4 0.623779 9.604893 2 True
5 0.751554 31.057962 2 True
6 0.000958 0.506435 2 True
7 0.154472 197.879862 1 True
8 3.303141 45.928752 1 True
9 10.764218 91.938161 1 True
10 0.583024 14.109219 1 True
11 0.564917 6.199713 1 True
12 0.103687 0.037409 1 True
13 0.102817 0.042574 1 True
14 0.449665 43.999941 1 True
fit_order
0 15
1 13
2 11
3 10
4 14
5 12
6 9
7 6
8 4
9 3
10 5
11 7
12 2
13 1
14 8 }
performance = predictor_new_features.evaluate(test_new)
print("The performance indicators are : \n", performance)
/usr/local/lib/python3.7/site-packages/scipy/stats/stats.py:4023: PearsonRConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
warnings.warn(PearsonRConstantInputWarning())
Evaluation: root_mean_squared_error on test data: -196.38650113235116
Note: Scores are always higher_is_better. This metric score can be multiplied by -1 to get the metric value.
Evaluations on test data:
{
"root_mean_squared_error": -196.38650113235116,
"mean_squared_error": -38567.65782700696,
"mean_absolute_error": -149.05909678496474,
"r2": 0.0,
"pearsonr": NaN,
"median_absolute_error": -116.65669250488281
}
The performance indicators are :
{'root_mean_squared_error': -196.38650113235116, 'mean_squared_error': -38567.65782700696, 'mean_absolute_error': -149.05909678496474, 'r2': 0.0, 'pearsonr': nan, 'median_absolute_error': -116.65669250488281}
# Remember to set all negative values to zero
predictions_new_features = predictor_new_features.predict(test_new)  # use the new-features predictor, not the original one
print('Negative predictions are:', predictions_new_features[predictions_new_features < 0])
Negative predictions are: Series([], Name: count, dtype: float32)
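No negative predictions appeared in this run, but to honor the reminder in the cell above, negatives could be zeroed before submitting. A minimal pandas sketch on made-up values (the Series name mirrors the competition's "count" column):

```python
import pandas as pd

# Illustrative predictions containing one negative value
preds = pd.Series([12.5, -3.2, 0.0, 87.1], name="count")

# Clip negatives to zero; Kaggle's metric for this competition (RMSLE)
# is undefined for negative counts, so they must be removed before upload
preds = preds.clip(lower=0)

print(preds.tolist())  # [12.5, 0.0, 0.0, 87.1]
```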
# Submitting predictions, as before
submission_new_features = submission.copy()  # copy so the original submission frame is not overwritten
submission_new_features["count"] = predictions_new_features
submission_new_features.to_csv("submission_new_features.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_features.csv -m "new features + set weather, holiday, season, workingday "
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 290kB/s] Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                     date                 description                                               status    publicScore  privateScore
---------------------------  -------------------  --------------------------------------------------------  --------  -----------  ------------
submission_new_features.csv  2022-10-19 20:23:34  new features + set weather, holiday, season, workingday   complete  1.80119      1.80119
submission_new_features.csv  2022-10-19 19:25:25  new features + set weather, holiday, season, workingday   complete  1.80895      1.80895
submission.csv               2022-10-19 18:30:41  3nd raw submission                                        complete  1.80895      1.80895
submission_new_features.csv  2022-10-16 22:46:40  new features                                              complete  1.80152      1.80152
1.80119

predictor_wo_datetime = TabularPredictor(label="count", problem_type="regression", eval_metric="root_mean_squared_error").fit(
    train_data=train_new.loc[:, train_new.columns.difference(["datetime", "casual", "registered"])], time_limit=600, presets="best_quality"
)
No path specified. Models will be saved in: "AutogluonModels/ag-20221023_202702/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221023_202702/"
AutoGluon Version: 0.5.2
Python Version: 3.7.10
Operating System: Linux
Train Data Rows: 10886
Train Data Columns: 12
Label Column: count
Preprocessing data ...
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 2982.12 MB
Train Data (Original) Memory Usage: 1.05 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 3 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 9 | ['day', 'holiday', 'hour', 'humidity', 'month', ...]
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 6 | ['day', 'hour', 'humidity', 'month', 'season', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'year']
0.2s = Fit runtime
12 features in original data used to generate 12 features in processed data.
Train Data (Processed) Memory Usage: 0.82 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.29s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.7s of the 599.69s of remaining time.
-123.781 = Validation score (-root_mean_squared_error)
0.07s = Training runtime
0.3s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 399.07s of the 599.07s of remaining time.
-119.1941 = Validation score (-root_mean_squared_error)
0.03s = Training runtime
0.2s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 398.61s of the 598.6s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
2022-10-23 20:27:05,406 WARNING services.py:2013 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 416284672 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=0.96gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
-37.5315 = Validation score (-root_mean_squared_error)
102.5s = Training runtime
20.97s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 288.6s of the 488.59s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-37.8753 = Validation score (-root_mean_squared_error)
39.71s = Training runtime
3.34s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 245.08s of the 445.08s of remaining time.
-42.1538 = Validation score (-root_mean_squared_error)
9.96s = Training runtime
0.57s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 231.84s of the 431.83s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-37.551 = Validation score (-root_mean_squared_error)
196.23s = Training runtime
0.15s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 32.44s of the 232.44s of remaining time.
-41.5225 = Validation score (-root_mean_squared_error)
5.22s = Training runtime
0.7s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 23.79s of the 223.78s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-103.9499 = Validation score (-root_mean_squared_error)
45.05s = Training runtime
0.49s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 175.38s of remaining time.
-36.0226 = Validation score (-root_mean_squared_error)
1.01s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 174.26s of the 174.23s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-36.9115 = Validation score (-root_mean_squared_error)
23.33s = Training runtime
0.27s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 147.36s of the 147.33s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-36.5501 = Validation score (-root_mean_squared_error)
22.53s = Training runtime
0.11s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 121.22s of the 121.19s of remaining time.
-37.1253 = Validation score (-root_mean_squared_error)
28.43s = Training runtime
0.88s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 89.22s of the 89.2s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-36.2752 = Validation score (-root_mean_squared_error)
30.21s = Training runtime
0.06s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L2 ... Training model for up to 55.91s of the 55.88s of remaining time.
-36.3866 = Validation score (-root_mean_squared_error)
8.45s = Training runtime
0.68s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L2 ... Training model for up to 43.93s of the 43.91s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-36.7555 = Validation score (-root_mean_squared_error)
59.6s = Training runtime
0.48s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -19.01s of remaining time.
-35.9994 = Validation score (-root_mean_squared_error)
0.43s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 619.68s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221023_202702/")
predictor_wo_datetime.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -35.999372 28.068213 520.002633 0.001175 0.430852 3 True 16
1 WeightedEnsemble_L2 -36.022646 25.732827 354.641560 0.001569 1.012878 2 True 9
2 CatBoost_BAG_L2 -36.275163 26.794154 428.989806 0.060683 30.213794 2 True 13
3 ExtraTreesMSE_BAG_L2 -36.386579 27.413426 407.229927 0.679956 8.453914 2 True 14
4 LightGBM_BAG_L2 -36.550106 26.843183 421.308851 0.109712 22.532838 2 True 11
5 NeuralNetFastAI_BAG_L2 -36.755484 27.216686 458.371235 0.483216 59.595222 2 True 15
6 LightGBMXT_BAG_L2 -36.911479 27.003105 422.109320 0.269634 23.333308 2 True 10
7 RandomForestMSE_BAG_L2 -37.125313 27.612287 427.202532 0.878816 28.426519 2 True 12
8 LightGBMXT_BAG_L1 -37.531467 20.966673 102.497085 20.966673 102.497085 1 True 3
9 CatBoost_BAG_L1 -37.551041 0.148058 196.234118 0.148058 196.234118 1 True 6
10 LightGBM_BAG_L1 -37.875308 3.339185 39.712124 3.339185 39.712124 1 True 4
11 ExtraTreesMSE_BAG_L1 -41.522503 0.703364 5.222335 0.703364 5.222335 1 True 7
12 RandomForestMSE_BAG_L1 -42.153772 0.573977 9.963019 0.573977 9.963019 1 True 5
13 NeuralNetFastAI_BAG_L1 -103.949897 0.494219 45.049742 0.494219 45.049742 1 True 8
14 KNeighborsDist_BAG_L1 -119.194060 0.204184 0.029573 0.204184 0.029573 1 True 2
15 KNeighborsUnif_BAG_L1 -123.781003 0.303809 0.068016 0.303809 0.068016 1 True 1
Number of models trained: 16
Types of models trained:
{'StackerEnsembleModel_KNN', 'StackerEnsembleModel_NNFastAiTabular', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_CatBoost', 'StackerEnsembleModel_RF', 'WeightedEnsembleModel', 'StackerEnsembleModel_LGB'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 6 | ['day', 'hour', 'humidity', 'month', 'season', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'year']
Plot summary of models saved to file: AutogluonModels/ag-20221023_202702/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'NeuralNetFastAI_BAG_L1': 'StackerEnsembleModel_NNFastAiTabular',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L2': 'StackerEnsembleModel_XT',
'NeuralNetFastAI_BAG_L2': 'StackerEnsembleModel_NNFastAiTabular',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -123.78100255079445,
'KNeighborsDist_BAG_L1': -119.19406017177057,
'LightGBMXT_BAG_L1': -37.531466866876414,
'LightGBM_BAG_L1': -37.87530799234931,
'RandomForestMSE_BAG_L1': -42.15377231942091,
'CatBoost_BAG_L1': -37.5510406746713,
'ExtraTreesMSE_BAG_L1': -41.52250272210862,
'NeuralNetFastAI_BAG_L1': -103.94989656652723,
'WeightedEnsemble_L2': -36.022645556361475,
'LightGBMXT_BAG_L2': -36.91147853077231,
'LightGBM_BAG_L2': -36.55010647506041,
'RandomForestMSE_BAG_L2': -37.12531330226235,
'CatBoost_BAG_L2': -36.27516272432256,
'ExtraTreesMSE_BAG_L2': -36.3865787954654,
'NeuralNetFastAI_BAG_L2': -36.75548401910065,
'WeightedEnsemble_L3': -35.99937150190432},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20221023_202702/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20221023_202702/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20221023_202702/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20221023_202702/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20221023_202702/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20221023_202702/models/CatBoost_BAG_L1/',
'ExtraTreesMSE_BAG_L1': 'AutogluonModels/ag-20221023_202702/models/ExtraTreesMSE_BAG_L1/',
'NeuralNetFastAI_BAG_L1': 'AutogluonModels/ag-20221023_202702/models/NeuralNetFastAI_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221023_202702/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20221023_202702/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20221023_202702/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20221023_202702/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20221023_202702/models/CatBoost_BAG_L2/',
'ExtraTreesMSE_BAG_L2': 'AutogluonModels/ag-20221023_202702/models/ExtraTreesMSE_BAG_L2/',
'NeuralNetFastAI_BAG_L2': 'AutogluonModels/ag-20221023_202702/models/NeuralNetFastAI_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221023_202702/models/WeightedEnsemble_L3/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.06801605224609375,
'KNeighborsDist_BAG_L1': 0.02957296371459961,
'LightGBMXT_BAG_L1': 102.4970850944519,
'LightGBM_BAG_L1': 39.71212434768677,
'RandomForestMSE_BAG_L1': 9.963019371032715,
'CatBoost_BAG_L1': 196.2341182231903,
'ExtraTreesMSE_BAG_L1': 5.222334861755371,
'NeuralNetFastAI_BAG_L1': 45.04974174499512,
'WeightedEnsemble_L2': 1.012878179550171,
'LightGBMXT_BAG_L2': 23.33330774307251,
'LightGBM_BAG_L2': 22.532838106155396,
'RandomForestMSE_BAG_L2': 28.42651891708374,
'CatBoost_BAG_L2': 30.213793754577637,
'ExtraTreesMSE_BAG_L2': 8.453914165496826,
'NeuralNetFastAI_BAG_L2': 59.59522247314453,
'WeightedEnsemble_L3': 0.43085169792175293},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.30380916595458984,
'KNeighborsDist_BAG_L1': 0.20418429374694824,
'LightGBMXT_BAG_L1': 20.966673374176025,
'LightGBM_BAG_L1': 3.3391854763031006,
'RandomForestMSE_BAG_L1': 0.5739772319793701,
'CatBoost_BAG_L1': 0.14805817604064941,
'ExtraTreesMSE_BAG_L1': 0.7033638954162598,
'NeuralNetFastAI_BAG_L1': 0.4942190647125244,
'WeightedEnsemble_L2': 0.0015690326690673828,
'LightGBMXT_BAG_L2': 0.2696342468261719,
'LightGBM_BAG_L2': 0.10971236228942871,
'RandomForestMSE_BAG_L2': 0.8788161277770996,
'CatBoost_BAG_L2': 0.06068301200866699,
'ExtraTreesMSE_BAG_L2': 0.6799557209014893,
'NeuralNetFastAI_BAG_L2': 0.4832158088684082,
'WeightedEnsemble_L3': 0.0011749267578125},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'NeuralNetFastAI_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'NeuralNetFastAI_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -35.999372 28.068213 520.002633
1 WeightedEnsemble_L2 -36.022646 25.732827 354.641560
2 CatBoost_BAG_L2 -36.275163 26.794154 428.989806
3 ExtraTreesMSE_BAG_L2 -36.386579 27.413426 407.229927
4 LightGBM_BAG_L2 -36.550106 26.843183 421.308851
5 NeuralNetFastAI_BAG_L2 -36.755484 27.216686 458.371235
6 LightGBMXT_BAG_L2 -36.911479 27.003105 422.109320
7 RandomForestMSE_BAG_L2 -37.125313 27.612287 427.202532
8 LightGBMXT_BAG_L1 -37.531467 20.966673 102.497085
9 CatBoost_BAG_L1 -37.551041 0.148058 196.234118
10 LightGBM_BAG_L1 -37.875308 3.339185 39.712124
11 ExtraTreesMSE_BAG_L1 -41.522503 0.703364 5.222335
12 RandomForestMSE_BAG_L1 -42.153772 0.573977 9.963019
13 NeuralNetFastAI_BAG_L1 -103.949897 0.494219 45.049742
14 KNeighborsDist_BAG_L1 -119.194060 0.204184 0.029573
15 KNeighborsUnif_BAG_L1 -123.781003 0.303809 0.068016
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.001175 0.430852 3 True
1 0.001569 1.012878 2 True
2 0.060683 30.213794 2 True
3 0.679956 8.453914 2 True
4 0.109712 22.532838 2 True
5 0.483216 59.595222 2 True
6 0.269634 23.333308 2 True
7 0.878816 28.426519 2 True
8 20.966673 102.497085 1 True
9 0.148058 196.234118 1 True
10 3.339185 39.712124 1 True
11 0.703364 5.222335 1 True
12 0.573977 9.963019 1 True
13 0.494219 45.049742 1 True
14 0.204184 0.029573 1 True
15 0.303809 0.068016 1 True
fit_order
0 16
1 9
2 13
3 14
4 11
5 15
6 10
7 12
8 3
9 6
10 4
11 7
12 5
13 8
14 2
15 1 }
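The summary above shows 8-fold bagging with multi-layer stack ensembling. The core bagging idea, training one child model per fold on the remaining folds and averaging all children at inference, can be sketched with a toy least-squares "child" (all data and names here are illustrative, not AutoGluon internals):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=80)

def fit_ols(X_tr, y_tr):
    # Plain least-squares "child model" standing in for LightGBM/CatBoost
    coef, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    return coef

# 8-fold bagging: each child trains on the other 7/8 of the data
n_folds = 8
folds = np.array_split(np.arange(len(X)), n_folds)
children = []
for fold in folds:
    mask = np.ones(len(X), dtype=bool)
    mask[fold] = False
    children.append(fit_ols(X[mask], y[mask]))

# At inference, the bag averages all child predictions
X_test = rng.normal(size=(5, 3))
bag_pred = np.mean([X_test @ c for c in children], axis=0)
print(bag_pred.shape)  # (5,)
```

Bagging reduces variance relative to any single child; AutoGluon additionally feeds the children's out-of-fold predictions to the next stack level as features.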
# The test set has no ground-truth counts; fill the label with zeros so evaluate() can run
# (the resulting scores are not meaningful and trigger the constant-input warning below)
test_new["count"] = 0
performance = predictor_wo_datetime.evaluate(test_new.loc[:, test_new.columns.difference(["datetime"])])
print("The performance indicators are:\n", performance)
/usr/local/lib/python3.7/site-packages/scipy/stats/stats.py:4023: PearsonRConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
warnings.warn(PearsonRConstantInputWarning())
Evaluation: root_mean_squared_error on test data: -256.9979977163498
Note: Scores are always higher_is_better. This metric score can be multiplied by -1 to get the metric value.
Evaluations on test data:
{
"root_mean_squared_error": -256.9979977163498,
"mean_squared_error": -66047.97083021293,
"mean_absolute_error": -189.7036975675971,
"r2": 0.0,
"pearsonr": NaN,
"median_absolute_error": -144.92901611328125
}
The performance indicators are:
{'root_mean_squared_error': -256.9979977163498, 'mean_squared_error': -66047.97083021293, 'mean_absolute_error': -189.7036975675971, 'r2': 0.0, 'pearsonr': nan, 'median_absolute_error': -144.92901611328125}
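Because the test labels were filled with zeros, the "evaluation" above only measures the magnitude of the predictions: against an all-zero target, RMSE reduces to the root mean square of the predictions themselves, and r2/pearsonr degenerate. A small numpy sketch of that identity (the numbers are made up):

```python
import numpy as np

preds = np.array([100.0, 200.0, 50.0])   # illustrative predictions
dummy_labels = np.zeros_like(preds)      # the all-zero "count" column

rmse = np.sqrt(np.mean((preds - dummy_labels) ** 2))
rms_of_preds = np.sqrt(np.mean(preds ** 2))

print(np.isclose(rmse, rms_of_preds))  # True
```

The real quality signal for this project is therefore the Kaggle public score, not these local test-set metrics.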
# Remember to set all negative values to zero
predictions_wo_datetime = predictor_wo_datetime.predict(test_new.loc[:, test_new.columns.difference(["datetime"])])
print('Negative predictions are:', predictions_wo_datetime[predictions_wo_datetime<0])
Negative predictions are: Series([], Name: count, dtype: float32)
# Submitting predictions, as before
submission_wo_datetime = submission.copy()  # copy so the original submission frame is not overwritten
submission_wo_datetime["count"] = predictions_wo_datetime
submission_wo_datetime.to_csv("submission_new_features_no_datetime.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_features_no_datetime.csv -m "new features + without datetime + set weather, holiday, season, workingday as categorical data "
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 302kB/s] Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                                 date                 description                                                                                      status    publicScore  privateScore
---------------------------------------  -------------------  -----------------------------------------------------------------------------------------------  --------  -----------  ------------
submission_new_features_no_datetime.csv  2022-10-23 20:47:25  new features + without datetime + set weather, holiday, season, workingday as categorical data   complete  0.47409      0.47409
submission_new_hpo.csv                   2022-10-23 02:55:42  new features with hyperparameter tuning of GBM and XGBoost                                       complete  0.47866      0.47866
submission_new_hpo.csv                   2022-10-23 02:24:32  new features with hyperparameter tuning of GBM and XGBoost                                       complete  0.47866      0.47866
submission_new_hpo.csv                   2022-10-19 22:30:48  new features with hyperparameters                                                                complete  0.48898      0.48898
1.80119

Hyperparameter tuning in AutoGluon is controlled through the hyperparameter and hyperparameter_tune_kwargs arguments. In this first attempt to tune the hyperparameters, I will change two factors:
hyperparameters = 'default'
hyperparameter_tune_kwargs = 'auto'
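Besides the string presets used below, TabularPredictor.fit() also accepts explicit per-model dictionaries and a tuning-options dict. A minimal sketch: the model keys "GBM" (LightGBM) and "XGB" (XGBoost) and the num_trials/scheduler/searcher options follow the AutoGluon API, but the numeric values are illustrative assumptions, not tuned settings:

```python
# Sketch only: explicit per-model hyperparameters for TabularPredictor.fit().
# "GBM" = LightGBM, "XGB" = XGBoost; the numbers are illustrative, not tuned.
hyperparameters = {
    "GBM": {"num_boost_round": 200, "learning_rate": 0.05},
    "XGB": {"n_estimators": 200, "max_depth": 8},
}

# Tuning options: run a few trials per model on the local scheduler
hyperparameter_tune_kwargs = {
    "num_trials": 5,
    "scheduler": "local",
    "searcher": "auto",
}

# Both dicts would then be passed to TabularPredictor(...).fit(
#     train_data=..., hyperparameters=hyperparameters,
#     hyperparameter_tune_kwargs=hyperparameter_tune_kwargs)
print(sorted(hyperparameters))  # ['GBM', 'XGB']
```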
predictor_new_hpo = TabularPredictor(label="count", problem_type="regression", eval_metric="root_mean_squared_error").fit(
train_data=train_new.loc[:, train_new.columns.difference(["casual","registered"])], time_limit=600, presets="best_quality",
hyperparameters=hyperparameters, hyperparameter_tune_kwargs=hyperparameter_tune_kwargs)
No path specified. Models will be saved in: "AutogluonModels/ag-20221023_211333/"
Presets specified: ['best_quality']
Warning: hyperparameter tuning is currently experimental and may cause the process to hang.
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221023_211333/"
AutoGluon Version: 0.5.2
Python Version: 3.7.10
Operating System: Linux
Train Data Rows: 10886
Train Data Columns: 13
Label Column: count
Preprocessing data ...
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 2346.99 MB
Train Data (Original) Memory Usage: 1.13 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 3 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 9 | ['day', 'holiday', 'hour', 'humidity', 'month', ...]
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 6 | ['day', 'hour', 'humidity', 'month', 'season', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.1s = Fit runtime
13 features in original data used to generate 17 features in processed data.
Train Data (Processed) Memory Usage: 1.25 MB (0.1% of available memory)
Data preprocessing and feature engineering runtime = 0.18s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Hyperparameter tuning model: KNeighborsUnif_BAG_L1 ... Tuning model for up to 4.09s of the 599.81s of remaining time.
No hyperparameter search space specified for KNeighborsUnif. Skipping HPO. Will train one model based on the provided hyperparameters.
Warning: Exception caused KNeighborsUnif_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 1001, in hyperparameter_tune
return self._hyperparameter_tune(hpo_executor=hpo_executor, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/stacker_ensemble_model.py", line 182, in _hyperparameter_tune
return super()._hyperparameter_tune(X=X, y=y, k_fold=k_fold, hpo_executor=hpo_executor, preprocess_kwargs=preprocess_kwargs, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/bagged_ensemble_model.py", line 1080, in _hyperparameter_tune
model_path = model_info['path']
TypeError: string indices must be integers
string indices must be integers
Hyperparameter tuning model: KNeighborsDist_BAG_L1 ... Tuning model for up to 4.09s of the 599.53s of remaining time.
No hyperparameter search space specified for KNeighborsDist. Skipping HPO. Will train one model based on the provided hyperparameters.
Warning: Exception caused KNeighborsDist_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 1001, in hyperparameter_tune
return self._hyperparameter_tune(hpo_executor=hpo_executor, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/stacker_ensemble_model.py", line 182, in _hyperparameter_tune
return super()._hyperparameter_tune(X=X, y=y, k_fold=k_fold, hpo_executor=hpo_executor, preprocess_kwargs=preprocess_kwargs, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/bagged_ensemble_model.py", line 1080, in _hyperparameter_tune
model_path = model_info['path']
TypeError: string indices must be integers
string indices must be integers
Hyperparameter tuning model: LightGBMXT_BAG_L1 ... Tuning model for up to 4.09s of the 599.27s of remaining time.
[1000] valid_set's rmse: 35.8966
[2000] valid_set's rmse: 33.8791
Ran out of time, early stopping on iteration 2741. Best iteration is: [2729] valid_set's rmse: 33.4549
Stopping HPO to satisfy time limit...
Fitted model: LightGBMXT_BAG_L1/T1 ...
-33.4549 = Validation score (-root_mean_squared_error)
3.75s = Training runtime
0.23s = Validation runtime
Hyperparameter tuning model: LightGBM_BAG_L1 ... Tuning model for up to 4.09s of the 593.6s of remaining time.
[1000] valid_set's rmse: 33.0497
Stopping HPO to satisfy time limit...
Fitted model: LightGBM_BAG_L1/T1 ...
-32.9844 = Validation score (-root_mean_squared_error)
1.65s = Training runtime
0.07s = Validation runtime
Hyperparameter tuning model: RandomForestMSE_BAG_L1 ... Tuning model for up to 4.09s of the 591.11s of remaining time.
No hyperparameter search space specified for RandomForestMSE. Skipping HPO. Will train one model based on the provided hyperparameters.
Warning: Exception caused RandomForestMSE_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 1001, in hyperparameter_tune
return self._hyperparameter_tune(hpo_executor=hpo_executor, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/stacker_ensemble_model.py", line 182, in _hyperparameter_tune
return super()._hyperparameter_tune(X=X, y=y, k_fold=k_fold, hpo_executor=hpo_executor, preprocess_kwargs=preprocess_kwargs, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/bagged_ensemble_model.py", line 1080, in _hyperparameter_tune
model_path = model_info['path']
TypeError: string indices must be integers
string indices must be integers
Hyperparameter tuning model: CatBoost_BAG_L1 ... Tuning model for up to 4.09s of the 575.31s of remaining time.
Ran out of time, early stopping on iteration 1026.
Stopping HPO to satisfy time limit...
Fitted model: CatBoost_BAG_L1/T1 ...
-34.0167 = Validation score (-root_mean_squared_error)
3.2s = Training runtime
0.0s = Validation runtime
Hyperparameter tuning model: ExtraTreesMSE_BAG_L1 ... Tuning model for up to 4.09s of the 571.78s of remaining time.
No hyperparameter search space specified for ExtraTreesMSE. Skipping HPO. Will train one model based on the provided hyperparameters.
Warning: Exception caused ExtraTreesMSE_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 1001, in hyperparameter_tune
return self._hyperparameter_tune(hpo_executor=hpo_executor, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/stacker_ensemble_model.py", line 182, in _hyperparameter_tune
return super()._hyperparameter_tune(X=X, y=y, k_fold=k_fold, hpo_executor=hpo_executor, preprocess_kwargs=preprocess_kwargs, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/bagged_ensemble_model.py", line 1080, in _hyperparameter_tune
model_path = model_info['path']
TypeError: string indices must be integers
string indices must be integers
Hyperparameter tuning model: NeuralNetFastAI_BAG_L1 ... Tuning model for up to 4.09s of the 563.19s of remaining time.
2022-10-23 21:14:13,326 WARNING services.py:2013 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 416284672 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=0.75gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
2022-10-23 21:14:16,738 ERROR syncer.py:147 -- Log sync requires rsync to be installed.
NaN or Inf found in input tensor.
2022-10-23 21:14:21,238 INFO stopper.py:364 -- Reached timeout of 3.270924513194561 seconds. Stopping all trials.
Hyperparameter tuning model: XGBoost_BAG_L1 ... Tuning model for up to 4.09s of the 552.04s of remaining time.
Stopping HPO to satisfy time limit...
Fitted model: XGBoost_BAG_L1/T1 ...
-33.5839 = Validation score (-root_mean_squared_error)
3.77s = Training runtime
0.03s = Validation runtime
Hyperparameter tuning model: NeuralNetTorch_BAG_L1 ... Tuning model for up to 4.09s of the 547.76s of remaining time.
NaN or Inf found in input tensor.
2022-10-23 21:14:31,574 INFO stopper.py:364 -- Reached timeout of 3.270924513194561 seconds. Stopping all trials.
Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 4.09s of the 541.71s of remaining time.
Fitting 1 child models (S1F1 - S1F1) | Fitting with ParallelLocalFoldFittingStrategy
-33.1828 = Validation score (-root_mean_squared_error)
4.83s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBMXT_BAG_L1/T1 ... Training model for up to 333.91s of the 533.94s of remaining time.
Fitting 7 child models (S1F2 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-34.5195 = Validation score (-root_mean_squared_error)
86.97s = Training runtime
10.64s = Validation runtime
Fitting model: LightGBM_BAG_L1/T1 ... Training model for up to 246.4s of the 446.44s of remaining time.
Fitting 7 child models (S1F2 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.9992 = Validation score (-root_mean_squared_error)
42.88s = Training runtime
3.05s = Validation runtime
Fitting model: CatBoost_BAG_L1/T1 ... Training model for up to 201.41s of the 401.45s of remaining time.
Fitting 7 child models (S1F2 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.5594 = Validation score (-root_mean_squared_error)
173.38s = Training runtime
0.1s = Validation runtime
Fitting model: XGBoost_BAG_L1/T1 ... Training model for up to 28.04s of the 228.08s of remaining time.
Fitting 7 child models (S1F2 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-34.4226 = Validation score (-root_mean_squared_error)
33.25s = Training runtime
0.58s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 195.44s of remaining time.
-32.2607 = Validation score (-root_mean_squared_error)
0.29s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Hyperparameter tuning model: LightGBMXT_BAG_L2 ... Tuning model for up to 2.44s of the 195.06s of remaining time.
Stopping HPO to satisfy time limit...
Fitted model: LightGBMXT_BAG_L2/T1 ...
-37.3993 = Validation score (-root_mean_squared_error)
1.03s = Training runtime
0.02s = Validation runtime
Hyperparameter tuning model: LightGBM_BAG_L2 ... Tuning model for up to 2.44s of the 193.65s of remaining time.
Ran out of time, early stopping on iteration 311. Best iteration is:
[60] valid_set's rmse: 36.8695
Stopping HPO to satisfy time limit...
Fitted model: LightGBM_BAG_L2/T1 ...
-36.684 = Validation score (-root_mean_squared_error)
0.81s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L2/T2 ...
-36.8695 = Validation score (-root_mean_squared_error)
1.01s = Training runtime
0.01s = Validation runtime
Hyperparameter tuning model: RandomForestMSE_BAG_L2 ... Tuning model for up to 2.44s of the 191.31s of remaining time.
No hyperparameter search space specified for RandomForestMSE. Skipping HPO. Will train one model based on the provided hyperparameters.
Warning: Exception caused RandomForestMSE_BAG_L2 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 1001, in hyperparameter_tune
return self._hyperparameter_tune(hpo_executor=hpo_executor, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/stacker_ensemble_model.py", line 182, in _hyperparameter_tune
return super()._hyperparameter_tune(X=X, y=y, k_fold=k_fold, hpo_executor=hpo_executor, preprocess_kwargs=preprocess_kwargs, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/bagged_ensemble_model.py", line 1080, in _hyperparameter_tune
model_path = model_info['path']
TypeError: string indices must be integers
string indices must be integers
Hyperparameter tuning model: CatBoost_BAG_L2 ... Tuning model for up to 2.44s of the 164.68s of remaining time.
Ran out of time, early stopping on iteration 441.
Stopping HPO to satisfy time limit...
Fitted model: CatBoost_BAG_L2/T1 ...
-36.0055 = Validation score (-root_mean_squared_error)
1.86s = Training runtime
0.0s = Validation runtime
Hyperparameter tuning model: ExtraTreesMSE_BAG_L2 ... Tuning model for up to 2.44s of the 162.52s of remaining time.
No hyperparameter search space specified for ExtraTreesMSE. Skipping HPO. Will train one model based on the provided hyperparameters.
Warning: Exception caused ExtraTreesMSE_BAG_L2 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 1001, in hyperparameter_tune
return self._hyperparameter_tune(hpo_executor=hpo_executor, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/stacker_ensemble_model.py", line 182, in _hyperparameter_tune
return super()._hyperparameter_tune(X=X, y=y, k_fold=k_fold, hpo_executor=hpo_executor, preprocess_kwargs=preprocess_kwargs, **kwargs)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/ensemble/bagged_ensemble_model.py", line 1080, in _hyperparameter_tune
model_path = model_info['path']
TypeError: string indices must be integers
string indices must be integers
Hyperparameter tuning model: NeuralNetFastAI_BAG_L2 ... Tuning model for up to 2.44s of the 151.54s of remaining time.
NaN or Inf found in input tensor.
2022-10-23 21:21:08,142 INFO stopper.py:364 -- Reached timeout of 1.950755183696747 seconds. Stopping all trials.
Hyperparameter tuning model: XGBoost_BAG_L2 ... Tuning model for up to 2.44s of the 145.0s of remaining time.
Stopping HPO to satisfy time limit...
Fitted model: XGBoost_BAG_L2/T1 ...
-37.0546 = Validation score (-root_mean_squared_error)
1.51s = Training runtime
0.01s = Validation runtime
Hyperparameter tuning model: NeuralNetTorch_BAG_L2 ... Tuning model for up to 2.44s of the 143.11s of remaining time.
NaN or Inf found in input tensor.
2022-10-23 21:21:16,165 INFO stopper.py:364 -- Reached timeout of 1.950755183696747 seconds. Stopping all trials.
Fitting model: LightGBMLarge_BAG_L2 ... Training model for up to 2.44s of the 136.94s of remaining time.
Fitting 1 child models (S1F1 - S1F1) | Fitting with ParallelLocalFoldFittingStrategy
-37.4264 = Validation score (-root_mean_squared_error)
3.01s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT_BAG_L2/T1 ... Training model for up to 131.58s of the 131.57s of remaining time.
Fitting 7 child models (S1F2 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.4906 = Validation score (-root_mean_squared_error)
20.31s = Training runtime
0.18s = Validation runtime
Fitting model: LightGBM_BAG_L2/T1 ... Training model for up to 109.73s of the 109.71s of remaining time.
Fitting 7 child models (S1F2 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-32.9263 = Validation score (-root_mean_squared_error)
19.26s = Training runtime
0.08s = Validation runtime
Fitting model: LightGBM_BAG_L2/T2 ... Training model for up to 88.18s of the 88.17s of remaining time.
Fitting 7 child models (S1F2 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.6334 = Validation score (-root_mean_squared_error)
21.86s = Training runtime
0.08s = Validation runtime
Fitting model: CatBoost_BAG_L2/T1 ... Training model for up to 64.37s of the 64.35s of remaining time.
Fitting 7 child models (S1F2 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-32.5454 = Validation score (-root_mean_squared_error)
24.35s = Training runtime
0.07s = Validation runtime
Fitting model: XGBoost_BAG_L2/T1 ... Training model for up to 38.65s of the 38.64s of remaining time.
Fitting 7 child models (S1F2 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.5402 = Validation score (-root_mean_squared_error)
19.53s = Training runtime
0.12s = Validation runtime
Fitting model: LightGBMLarge_BAG_L2 ... Training model for up to 17.54s of the 17.53s of remaining time.
Fitting 7 child models (S1F2 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.8536 = Validation score (-root_mean_squared_error)
30.66s = Training runtime
0.16s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -13.28s of remaining time.
-32.4574 = Validation score (-root_mean_squared_error)
0.42s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 613.92s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221023_211333/")
predictor_new_hpo.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -32.303080 13.229172 448.804626 0.000901 0.387506 3 True 14
1 WeightedEnsemble_L2 -32.316131 12.340177 337.509858 0.001140 0.295377 2 True 6
2 CatBoost_BAG_L2/T1 -32.518530 12.449266 372.622336 0.110229 35.407855 2 True 11
3 LightGBM_BAG_L2/T1 -32.610988 12.477141 356.246051 0.138104 19.031570 2 True 9
4 XGBoost_BAG_L2/T1 -32.942820 12.481319 351.933692 0.142282 14.719211 2 True 12
5 LightGBMXT_BAG_L2/T1 -33.346366 12.574636 357.023190 0.235599 19.808709 2 True 7
6 LightGBM_BAG_L2/T2 -33.349182 12.474386 360.440445 0.135350 23.225964 2 True 10
7 LightGBMLarge_BAG_L1 -33.687603 0.111603 4.693507 0.111603 4.693507 1 True 5
8 LightGBM_BAG_L1/T1 -33.915986 2.554125 39.672345 2.554125 39.672345 1 True 2
9 LightGBMXT_BAG_L2/T2 -34.017983 12.602056 359.449775 0.263019 22.235294 2 True 8
10 LightGBMXT_BAG_L1/T1 -34.337455 9.090987 84.448378 9.090987 84.448378 1 True 1
11 XGBoost_BAG_L1/T1 -34.552675 0.541804 32.540259 0.541804 32.540259 1 True 4
12 CatBoost_BAG_L1/T1 -34.707230 0.152121 180.553499 0.152121 180.553499 1 True 3
13 LightGBMLarge_BAG_L2 -36.011828 12.358922 340.162013 0.019885 2.947532 2 True 13
Number of models trained: 14
Types of models trained:
{'StackerEnsembleModel_XGBoost', 'StackerEnsembleModel_CatBoost', 'StackerEnsembleModel_LGB', 'WeightedEnsembleModel'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 2 | ['season', 'weather']
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 4 | ['day', 'hour', 'humidity', 'month']
('int', ['bool']) : 3 | ['holiday', 'workingday', 'year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221019_223247/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'LightGBMXT_BAG_L1/T1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T1': 'StackerEnsembleModel_LGB',
'CatBoost_BAG_L1/T1': 'StackerEnsembleModel_CatBoost',
'XGBoost_BAG_L1/T1': 'StackerEnsembleModel_XGBoost',
'LightGBMLarge_BAG_L1': 'StackerEnsembleModel_LGB',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2/T1': 'StackerEnsembleModel_LGB',
'LightGBMXT_BAG_L2/T2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T2': 'StackerEnsembleModel_LGB',
'CatBoost_BAG_L2/T1': 'StackerEnsembleModel_CatBoost',
'XGBoost_BAG_L2/T1': 'StackerEnsembleModel_XGBoost',
'LightGBMLarge_BAG_L2': 'StackerEnsembleModel_LGB',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'LightGBMXT_BAG_L1/T1': -34.33745545082705,
'LightGBM_BAG_L1/T1': -33.91598594261488,
'CatBoost_BAG_L1/T1': -34.70722968272442,
'XGBoost_BAG_L1/T1': -34.552674851674695,
'LightGBMLarge_BAG_L1': -33.687602754409426,
'WeightedEnsemble_L2': -32.31613067415643,
'LightGBMXT_BAG_L2/T1': -33.34636635930924,
'LightGBMXT_BAG_L2/T2': -34.01798318498792,
'LightGBM_BAG_L2/T1': -32.61098830441786,
'LightGBM_BAG_L2/T2': -33.34918196644897,
'CatBoost_BAG_L2/T1': -32.51853023440451,
'XGBoost_BAG_L2/T1': -32.9428199805874,
'LightGBMLarge_BAG_L2': -36.01182792599411,
'WeightedEnsemble_L3': -32.303080325729134},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'LightGBMXT_BAG_L1/T1': 'AutogluonModels/ag-20221019_223247/models/LightGBMXT_BAG_L1/T1/',
'LightGBM_BAG_L1/T1': 'AutogluonModels/ag-20221019_223247/models/LightGBM_BAG_L1/T1/',
'CatBoost_BAG_L1/T1': 'AutogluonModels/ag-20221019_223247/models/CatBoost_BAG_L1/T1/',
'XGBoost_BAG_L1/T1': 'AutogluonModels/ag-20221019_223247/models/XGBoost_BAG_L1/T1/',
'LightGBMLarge_BAG_L1': 'AutogluonModels/ag-20221019_223247/models/LightGBMLarge_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221019_223247/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2/T1': 'AutogluonModels/ag-20221019_223247/models/LightGBMXT_BAG_L2/T1/',
'LightGBMXT_BAG_L2/T2': 'AutogluonModels/ag-20221019_223247/models/LightGBMXT_BAG_L2/T2/',
'LightGBM_BAG_L2/T1': 'AutogluonModels/ag-20221019_223247/models/LightGBM_BAG_L2/T1/',
'LightGBM_BAG_L2/T2': 'AutogluonModels/ag-20221019_223247/models/LightGBM_BAG_L2/T2/',
'CatBoost_BAG_L2/T1': 'AutogluonModels/ag-20221019_223247/models/CatBoost_BAG_L2/T1/',
'XGBoost_BAG_L2/T1': 'AutogluonModels/ag-20221019_223247/models/XGBoost_BAG_L2/T1/',
'LightGBMLarge_BAG_L2': 'AutogluonModels/ag-20221019_223247/models/LightGBMLarge_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221019_223247/models/WeightedEnsemble_L3/'},
'model_fit_times': {'LightGBMXT_BAG_L1/T1': 84.44837832450867,
'LightGBM_BAG_L1/T1': 39.67234492301941,
'CatBoost_BAG_L1/T1': 180.55349850654602,
'XGBoost_BAG_L1/T1': 32.54025936126709,
'LightGBMLarge_BAG_L1': 4.693507432937622,
'WeightedEnsemble_L2': 0.2953770160675049,
'LightGBMXT_BAG_L2/T1': 19.808709144592285,
'LightGBMXT_BAG_L2/T2': 22.235293865203857,
'LightGBM_BAG_L2/T1': 19.031569957733154,
'LightGBM_BAG_L2/T2': 23.225964069366455,
'CatBoost_BAG_L2/T1': 35.40785527229309,
'XGBoost_BAG_L2/T1': 14.719210863113403,
'LightGBMLarge_BAG_L2': 2.9475321769714355,
'WeightedEnsemble_L3': 0.38750624656677246},
'model_pred_times': {'LightGBMXT_BAG_L1/T1': 9.090986967086792,
'LightGBM_BAG_L1/T1': 2.554124593734741,
'CatBoost_BAG_L1/T1': 0.15212106704711914,
'XGBoost_BAG_L1/T1': 0.5418040752410889,
'LightGBMLarge_BAG_L1': 0.1116032600402832,
'WeightedEnsemble_L2': 0.0011401176452636719,
'LightGBMXT_BAG_L2/T1': 0.23559927940368652,
'LightGBMXT_BAG_L2/T2': 0.2630190849304199,
'LightGBM_BAG_L2/T1': 0.13810420036315918,
'LightGBM_BAG_L2/T2': 0.13534951210021973,
'CatBoost_BAG_L2/T1': 0.1102294921875,
'XGBoost_BAG_L2/T1': 0.14228224754333496,
'LightGBMLarge_BAG_L2': 0.01988506317138672,
'WeightedEnsemble_L3': 0.0009007453918457031},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'LightGBMXT_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'CatBoost_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMLarge_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'CatBoost_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMLarge_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -32.303080 13.229172 448.804626
1 WeightedEnsemble_L2 -32.316131 12.340177 337.509858
2 CatBoost_BAG_L2/T1 -32.518530 12.449266 372.622336
3 LightGBM_BAG_L2/T1 -32.610988 12.477141 356.246051
4 XGBoost_BAG_L2/T1 -32.942820 12.481319 351.933692
5 LightGBMXT_BAG_L2/T1 -33.346366 12.574636 357.023190
6 LightGBM_BAG_L2/T2 -33.349182 12.474386 360.440445
7 LightGBMLarge_BAG_L1 -33.687603 0.111603 4.693507
8 LightGBM_BAG_L1/T1 -33.915986 2.554125 39.672345
9 LightGBMXT_BAG_L2/T2 -34.017983 12.602056 359.449775
10 LightGBMXT_BAG_L1/T1 -34.337455 9.090987 84.448378
11 XGBoost_BAG_L1/T1 -34.552675 0.541804 32.540259
12 CatBoost_BAG_L1/T1 -34.707230 0.152121 180.553499
13 LightGBMLarge_BAG_L2 -36.011828 12.358922 340.162013
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.000901 0.387506 3 True
1 0.001140 0.295377 2 True
2 0.110229 35.407855 2 True
3 0.138104 19.031570 2 True
4 0.142282 14.719211 2 True
5 0.235599 19.808709 2 True
6 0.135350 23.225964 2 True
7 0.111603 4.693507 1 True
8 2.554125 39.672345 1 True
9 0.263019 22.235294 2 True
10 9.090987 84.448378 1 True
11 0.541804 32.540259 1 True
12 0.152121 180.553499 1 True
13 0.019885 2.947532 2 True
fit_order
0 14
1 6
2 11
3 9
4 12
5 7
6 10
7 5
8 2
9 8
10 1
11 4
12 3
13 13 }
Let's plot the validation scores of the top-performing models.
predictor_new_hpo.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val")
<AxesSubplot:xlabel='model'>
# The test set has no true "count" labels; add a dummy zero column so evaluate() can run
test_new["count"] = 0
performance_new_hpo = predictor_new_hpo.evaluate(test_new)
print("The performance indicators are : \n", performance_new_hpo)
/usr/local/lib/python3.7/site-packages/scipy/stats/stats.py:4023: PearsonRConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
warnings.warn(PearsonRConstantInputWarning())
Evaluation: root_mean_squared_error on test data: -256.7605465791283
Note: Scores are always higher_is_better. This metric score can be multiplied by -1 to get the metric value.
Evaluations on test data:
{
"root_mean_squared_error": -256.7605465791283,
"mean_squared_error": -65925.97827961271,
"mean_absolute_error": -190.33855804882228,
"r2": 0.0,
"pearsonr": NaN,
"median_absolute_error": -149.23776245117188
}
The performance indicators are :
{'root_mean_squared_error': -256.7605465791283, 'mean_squared_error': -65925.97827961271, 'mean_absolute_error': -190.33855804882228, 'r2': 0.0, 'pearsonr': nan, 'median_absolute_error': -149.23776245117188}
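The `r2` of 0.0 and `pearsonr` of NaN above are artifacts of evaluating against the dummy constant `count` column: Pearson correlation is undefined when one input has zero variance, and AutoGluon negates error metrics so that higher scores are always better. A minimal stdlib sketch of both points (these helpers are illustrative, not AutoGluon's implementation):

```python
import math

def pearsonr(x, y):
    # Pearson correlation; undefined (NaN) when either input has zero variance
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return float("nan")
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def neg_rmse(y_true, y_pred):
    # AutoGluon reports -RMSE so that "higher is better" holds for every metric
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return -math.sqrt(mse)

dummy_labels = [0.0, 0.0, 0.0]        # the dummy count column
preds = [12.8, 7.5, 7.0]
print(pearsonr(dummy_labels, preds))  # nan: constant labels have zero variance
print(neg_rmse(dummy_labels, preds))
```

This is why a score against dummy labels says nothing about model quality; the Kaggle submission below is the meaningful evaluation.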
# Remember to set all negative values to zero
predictions_new_features = predictor_new_hpo.predict(test_new).clip(lower=0)
predictions_new_features
0 12.811199
1 7.545908
2 7.033610
3 6.769294
4 6.894789
...
6488 327.683197
6489 218.574661
6490 148.141708
6491 97.523758
6492 48.462696
Name: count, Length: 6493, dtype: float32
predictions_new_features[predictions_new_features<0]
Series([], Name: count, dtype: float32)
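The empty Series confirms there are no negative predictions this time, but an unconditional clip is the safer habit the template's reminder calls for. A small pandas sketch (the values are made up):

```python
import pandas as pd

preds = pd.Series([-3.2, 0.0, 12.8, 97.5], name="count")
preds = preds.clip(lower=0)  # floor every negative prediction at zero
print(preds.tolist())  # [0.0, 0.0, 12.8, 97.5]
```

Kaggle rejects negative counts for this competition, so clipping before writing the submission CSV avoids a failed scoring run.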
# Submit predictions the same way as before
submission_new_hpo = submission.copy()  # copy to avoid mutating the original submission frame
submission_new_hpo["count"] = predictions_new_features
submission_new_hpo.to_csv("submission_new_hpo.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo.csv -m "new features with hyperparameters"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 349kB/s]
Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                     date                 description                                               status    publicScore  privateScore
---------------------------  -------------------  --------------------------------------------------------  --------  -----------  ------------
submission_new_hpo.csv       2022-10-19 22:30:48  new features with hyperparameters                         complete  0.48898      0.48898
submission_new_features.csv  2022-10-19 20:23:34  new features + set weather, holiday, season, workingday   complete  1.80119      1.80119
submission_new_features.csv  2022-10-19 19:25:25  new features + set weather, holiday, season, workingday   complete  1.80895      1.80895
submission.csv               2022-10-19 18:30:41  3nd raw submission                                        complete  1.80895      1.80895
0.48898

I will now tune the hyperparameters of some of the models AutoGluon tested, focusing on those that ranked among the top 10 performers in the first hyperparameter tuning attempt. In this section I therefore try different parameters for LightGBM and XGBoost. Since CatBoost usually performs well with its default parameters, I will not tune it.
gbm_config = [{'num_boost_round': 100},  # number of boosting rounds (controls training time of GBM models)
              # alternative: search a range instead, e.g. 'num_leaves': ag.space.Int(lower=10, upper=50)
              {'num_leaves': 70},
              {'num_leaves': 100},
              {'num_leaves': 150}]
xgb_config = [{'eta': 0.1},  # note: the key must be 'eta' with no trailing space
              {'eta': 0.2},
              {'n_estimators': 50},
              {'n_estimators': 100},
              {'n_estimators': 150}]
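Instead of enumerating a handful of discrete configs, AutoGluon can also search continuous or integer ranges, as the commented `ag.space.Int` line above hints. A sketch of that style (ranges chosen purely for illustration; assumes the `autogluon.core` space API of autogluon 0.5.x):

```python
import autogluon.core as ag

# Hypothetical search spaces replacing the fixed config lists above
gbm_space = {'num_boost_round': 100,
             'num_leaves': ag.space.Int(lower=31, upper=150, default=36)}
xgb_space = {'eta': ag.space.Real(0.05, 0.3, log=True),
             'n_estimators': ag.space.Int(50, 150)}

hyperparameters_space = {'GBM': gbm_space, 'XGB': xgb_space}
```

With search spaces, each HPO trial samples a point from the range rather than cycling through a fixed list, so `num_trials` in `hyperparameter_tune_kwargs` controls how much of the space gets explored.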
# Example format: hyperparameters = {'RF': [{'criterion': 'gini'}, {'criterion': 'entropy'}]}
hyperparameters = {
    'GBM': gbm_config,
    #'CAT': cat_config,
    'XGB': xgb_config
}
#hyperparameter_tune_kwargs = 'auto'
hyperparameter_tune_kwargs = {'searcher': 'auto'}
predictor_new_hpo = TabularPredictor(label="count", problem_type="regression", eval_metric="root_mean_squared_error").fit(
train_data=train_new.loc[:, train_new.columns.difference(["casual","registered"])], time_limit=600, presets="best_quality",
hyperparameters=hyperparameters, hyperparameter_tune_kwargs=hyperparameter_tune_kwargs)
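As the log below shows, every model is skipped with `Required key 'scheduler' is not present in hyperparameter_tune_kwargs`: when a dict is passed, AutoGluon expects a scheduler (and a trial budget) to be named as well. A corrected sketch (key names per the 0.5.x `TabularPredictor.fit` docs; the trial count is an arbitrary choice):

```python
# Either let AutoGluon pick everything:
hyperparameter_tune_kwargs = 'auto'

# ...or spell out the full dict, which must include a scheduler:
hyperparameter_tune_kwargs = {
    'searcher': 'auto',
    'scheduler': 'local',
    'num_trials': 5,
}
```

Passing the string `'auto'` is the simplest fix, since AutoGluon then fills in the searcher, scheduler, and trial count itself.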
No path specified. Models will be saved in: "AutogluonModels/ag-20221023_025044/"
Presets specified: ['best_quality']
Warning: hyperparameter tuning is currently experimental and may cause the process to hang.
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221023_025044/"
AutoGluon Version: 0.5.2
Python Version: 3.7.10
Operating System: Linux
Train Data Rows: 10886
Train Data Columns: 13
Label Column: count
Preprocessing data ...
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 2458.08 MB
Train Data (Original) Memory Usage: 0.83 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 3 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('category', []) : 4 | ['holiday', 'season', 'weather', 'workingday']
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 5 | ['day', 'hour', 'humidity', 'month', 'year']
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 2 | ['season', 'weather']
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 4 | ['day', 'hour', 'humidity', 'month']
('int', ['bool']) : 3 | ['holiday', 'workingday', 'year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.1s = Fit runtime
13 features in original data used to generate 17 features in processed data.
Train Data (Processed) Memory Usage: 1.1 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.19s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 9 L1 models ...
Hyperparameter tuning model: LightGBM_BAG_L1 ... Tuning model for up to 5.0s of the 599.81s of remaining time.
Warning: Exception caused LightGBM_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 996, in hyperparameter_tune
hpo_executor.initialize(hyperparameter_tune_kwargs, default_num_trials=default_num_trials, time_limit=time_limit)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/hpo/executors.py", line 318, in initialize
hyperparameter_tune_kwargs = scheduler_factory(hyperparameter_tune_kwargs, num_trials=num_trials, nthreads_per_trial='auto', ngpus_per_trial='auto')
File "/usr/local/lib/python3.7/site-packages/autogluon/core/scheduler/scheduler_factory.py", line 76, in scheduler_factory
raise ValueError(f"Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {hyperparameter_tune_kwargs}")
ValueError: Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Hyperparameter tuning model: LightGBM_2_BAG_L1 ... Tuning model for up to 5.0s of the 599.79s of remaining time.
Warning: Exception caused LightGBM_2_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 996, in hyperparameter_tune
hpo_executor.initialize(hyperparameter_tune_kwargs, default_num_trials=default_num_trials, time_limit=time_limit)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/hpo/executors.py", line 318, in initialize
hyperparameter_tune_kwargs = scheduler_factory(hyperparameter_tune_kwargs, num_trials=num_trials, nthreads_per_trial='auto', ngpus_per_trial='auto')
File "/usr/local/lib/python3.7/site-packages/autogluon/core/scheduler/scheduler_factory.py", line 76, in scheduler_factory
raise ValueError(f"Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {hyperparameter_tune_kwargs}")
ValueError: Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Hyperparameter tuning model: LightGBM_3_BAG_L1 ... Tuning model for up to 5.0s of the 599.77s of remaining time.
Warning: Exception caused LightGBM_3_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 996, in hyperparameter_tune
hpo_executor.initialize(hyperparameter_tune_kwargs, default_num_trials=default_num_trials, time_limit=time_limit)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/hpo/executors.py", line 318, in initialize
hyperparameter_tune_kwargs = scheduler_factory(hyperparameter_tune_kwargs, num_trials=num_trials, nthreads_per_trial='auto', ngpus_per_trial='auto')
File "/usr/local/lib/python3.7/site-packages/autogluon/core/scheduler/scheduler_factory.py", line 76, in scheduler_factory
raise ValueError(f"Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {hyperparameter_tune_kwargs}")
ValueError: Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Hyperparameter tuning model: LightGBM_4_BAG_L1 ... Tuning model for up to 5.0s of the 599.75s of remaining time.
Warning: Exception caused LightGBM_4_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 996, in hyperparameter_tune
hpo_executor.initialize(hyperparameter_tune_kwargs, default_num_trials=default_num_trials, time_limit=time_limit)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/hpo/executors.py", line 318, in initialize
hyperparameter_tune_kwargs = scheduler_factory(hyperparameter_tune_kwargs, num_trials=num_trials, nthreads_per_trial='auto', ngpus_per_trial='auto')
File "/usr/local/lib/python3.7/site-packages/autogluon/core/scheduler/scheduler_factory.py", line 76, in scheduler_factory
raise ValueError(f"Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {hyperparameter_tune_kwargs}")
ValueError: Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Hyperparameter tuning model: XGBoost_BAG_L1 ... Tuning model for up to 5.0s of the 599.73s of remaining time.
Warning: Exception caused XGBoost_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 1385, in _train_single_full
**model_fit_kwargs
File "/usr/local/lib/python3.7/site-packages/autogluon/core/models/abstract/abstract_model.py", line 996, in hyperparameter_tune
hpo_executor.initialize(hyperparameter_tune_kwargs, default_num_trials=default_num_trials, time_limit=time_limit)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/hpo/executors.py", line 318, in initialize
hyperparameter_tune_kwargs = scheduler_factory(hyperparameter_tune_kwargs, num_trials=num_trials, nthreads_per_trial='auto', ngpus_per_trial='auto')
File "/usr/local/lib/python3.7/site-packages/autogluon/core/scheduler/scheduler_factory.py", line 76, in scheduler_factory
raise ValueError(f"Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {hyperparameter_tune_kwargs}")
ValueError: Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Hyperparameter tuning model: XGBoost_2_BAG_L1 ... Tuning model for up to 5.0s of the 599.71s of remaining time.
Warning: Exception caused XGBoost_2_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
ValueError: Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Hyperparameter tuning model: XGBoost_3_BAG_L1 ... Tuning model for up to 5.0s of the 599.69s of remaining time.
Warning: Exception caused XGBoost_3_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
ValueError: Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Hyperparameter tuning model: XGBoost_4_BAG_L1 ... Tuning model for up to 5.0s of the 599.68s of remaining time.
Warning: Exception caused XGBoost_4_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
ValueError: Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Hyperparameter tuning model: XGBoost_5_BAG_L1 ... Tuning model for up to 5.0s of the 599.66s of remaining time.
Warning: Exception caused XGBoost_5_BAG_L1 to fail during hyperparameter tuning... Skipping this model.
ValueError: Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Required key 'scheduler' is not present in hyperparameter_tune_kwargs: {'searcher': 'auto'}
Completed 1/20 k-fold bagging repeats ...
No base models to train on, skipping auxiliary stack level 2...
No base models to train on, skipping stack level 2...
No base models to train on, skipping auxiliary stack level 3...
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-31-c3b3a4e44df2> in <module>
     25 predictor_new_hpo = TabularPredictor(label="count", problem_type="regression", eval_metric="root_mean_squared_error").fit(
     26     train_data=train_new.loc[:, train_new.columns.difference(["casual","registered"])], time_limit=600, presets="best_quality",
---> 27     hyperparameters=hyperparameters, hyperparameter_tune_kwargs=hyperparameter_tune_kwargs)

/usr/local/lib/python3.7/site-packages/autogluon/core/utils/decorators.py in _call(*args, **kwargs)
     28     def _call(*args, **kwargs):
     29         gargs, gkwargs = g(*other_args, *args, **kwargs)
---> 30         return f(*gargs, **gkwargs)
     31     return _call
     32     return _unpack_inner

/usr/local/lib/python3.7/site-packages/autogluon/tabular/predictor/predictor.py in fit(self, train_data, tuning_data, time_limit, presets, hyperparameters, feature_metadata, infer_limit, infer_limit_batch_size, **kwargs)
    834     hyperparameters=hyperparameters, core_kwargs=core_kwargs,
    835     time_limit=time_limit, infer_limit=infer_limit, infer_limit_batch_size=infer_limit_batch_size,
--> 836     verbosity=verbosity, use_bag_holdout=use_bag_holdout)
    837     self._set_post_fit_vars()
    838 

/usr/local/lib/python3.7/site-packages/autogluon/tabular/learner/abstract_learner.py in fit(self, X, X_val, **kwargs)
    116     raise AssertionError('Learner is already fit.')
    117     self._validate_fit_input(X=X, X_val=X_val, **kwargs)
--> 118     return self._fit(X=X, X_val=X_val, **kwargs)
    119 
    120     def _fit(self, X: DataFrame, X_val: DataFrame = None, scheduler_options=None, hyperparameter_tune=False,

/usr/local/lib/python3.7/site-packages/autogluon/tabular/learner/default_learner.py in _fit(self, X, X_val, X_unlabeled, holdout_frac, num_bag_folds, num_bag_sets, time_limit, infer_limit, infer_limit_batch_size, verbosity, **trainer_fit_kwargs)
    135     infer_limit_batch_size=infer_limit_batch_size,
    136     groups=groups,
--> 137     **trainer_fit_kwargs
    138     )
    139 
        self.save_trainer(trainer=trainer)

/usr/local/lib/python3.7/site-packages/autogluon/tabular/trainer/auto_trainer.py in fit(self, X, y, hyperparameters, X_val, y_val, X_unlabeled, holdout_frac, num_stack_levels, core_kwargs, time_limit, infer_limit, infer_limit_batch_size, use_bag_holdout, groups, **kwargs)
     94     infer_limit=infer_limit,
     95     infer_limit_batch_size=infer_limit_batch_size,
---> 96     groups=groups)
     97 
     98     def construct_model_templates_distillation(self, hyperparameters, **kwargs):

/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py in _train_multi_and_ensemble(self, X, y, X_val, y_val, hyperparameters, X_unlabeled, num_stack_levels, time_limit, groups, **kwargs)
   1666     X_unlabeled=X_unlabeled, level_start=1, level_end=num_stack_levels+1, time_limit=time_limit, **kwargs)
   1667     if len(self.get_model_names()) == 0:
-> 1668         raise ValueError('AutoGluon did not successfully train any models')
   1669     return model_names_fit
   1670 

ValueError: AutoGluon did not successfully train any models
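The ValueError above originates in AutoGluon's scheduler_factory, which requires a 'scheduler' key in hyperparameter_tune_kwargs in addition to 'searcher'; because every model's HPO failed on that check, no models were trained at all. Below is a minimal sketch of a kwargs dict that satisfies the check. The 'local' scheduler value and num_trials=5 are illustrative assumptions, and the fit() call is commented out because it depends on the notebook's train_new DataFrame and hyperparameters dict.

```python
# A minimal sketch, assuming the 'local' scheduler is available: the
# traceback shows scheduler_factory raising because only 'searcher' was
# supplied, so the fix is to pass a 'scheduler' key as well.
hyperparameter_tune_kwargs = {
    "num_trials": 5,       # HPO trials per model (assumed value)
    "searcher": "auto",    # same searcher the failing run used
    "scheduler": "local",  # the key the ValueError reports as missing
}

# The fit() call itself (commented out: it needs the notebook's
# train_new DataFrame and hyperparameters dict):
# predictor_new_hpo = TabularPredictor(
#     label="count", problem_type="regression",
#     eval_metric="root_mean_squared_error",
# ).fit(
#     train_data=train_new.loc[:, train_new.columns.difference(["casual", "registered"])],
#     time_limit=600, presets="best_quality",
#     hyperparameters=hyperparameters,
#     hyperparameter_tune_kwargs=hyperparameter_tune_kwargs,
# )

print("scheduler" in hyperparameter_tune_kwargs)  # → True
```

With both keys present, each model's tuning runs instead of being skipped, which is why the fit_summary() below reports per-trial models such as LightGBM_BAG_L1/T2.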
predictor_new_hpo.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -32.616308 8.485239 370.376266 0.001056 0.431738 3 True 43
1 WeightedEnsemble_L2 -32.739922 6.628335 189.298007 0.000808 0.767611 2 True 30
2 LightGBM_BAG_L2/T1 -32.762137 8.316424 340.986121 0.134214 17.128103 2 True 31
3 XGBoost_2_BAG_L2/T1 -32.880989 8.349969 352.816425 0.167759 28.958407 2 True 36
4 LightGBM_2_BAG_L2/T1 -33.223683 8.356585 354.616779 0.174376 30.758761 2 True 33
5 LightGBM_3_BAG_L2/T1 -33.446330 8.339879 358.925689 0.157669 35.067671 2 True 34
6 LightGBM_BAG_L2/T2 -33.530842 8.301745 342.932317 0.119535 19.074299 2 True 32
7 LightGBM_4_BAG_L2/T1 -33.556592 8.404984 367.835747 0.222774 43.977730 2 True 35
8 LightGBM_2_BAG_L1/T1 -33.820572 2.035105 38.132162 2.035105 38.132162 1 True 7
9 LightGBM_3_BAG_L1/T1 -33.824605 1.926668 38.537019 1.926668 38.537019 1 True 8
10 LightGBM_4_BAG_L1/T1 -34.112016 1.739962 40.472843 1.739962 40.472843 1 True 9
11 XGBoost_2_BAG_L1/T1 -34.529856 0.567162 40.906013 0.567162 40.906013 1 True 10
12 XGBoost_5_BAG_L1/T4 -35.439059 0.016123 1.317936 0.016123 1.317936 1 True 29
13 XGBoost_4_BAG_L1/T4 -35.439059 0.016234 1.270561 0.016234 1.270561 1 True 25
14 LightGBM_BAG_L1/T2 -35.590010 0.187944 16.445438 0.187944 16.445438 1 True 2
15 XGBoost_3_BAG_L2/T1 -35.790098 8.195417 324.409641 0.013207 0.551623 2 True 37
16 XGBoost_5_BAG_L2/T1 -35.790098 8.197323 325.094311 0.015113 1.236293 2 True 42
17 XGBoost_4_BAG_L2/T1 -35.790098 8.197680 324.728979 0.015471 0.870961 2 True 40
18 XGBoost_5_BAG_L1/T1 -35.867266 0.014343 0.551025 0.014343 0.551025 1 True 26
19 XGBoost_4_BAG_L2/T2 -35.923375 8.196890 324.720822 0.014681 0.862804 2 True 41
20 LightGBM_BAG_L1/T5 -36.364213 0.159667 16.055488 0.159667 16.055488 1 True 5
21 XGBoost_3_BAG_L1/T4 -36.985329 0.170686 14.036922 0.170686 14.036922 1 True 14
22 XGBoost_4_BAG_L1/T1 -37.171439 0.013148 0.494805 0.013148 0.494805 1 True 22
23 LightGBM_BAG_L1/T3 -38.964909 0.201738 16.041790 0.201738 16.041790 1 True 3
24 XGBoost_5_BAG_L1/T2 -39.877508 0.014994 0.627352 0.014994 0.627352 1 True 27
25 LightGBM_BAG_L1/T1 -41.324754 0.138658 16.239201 0.138658 16.239201 1 True 1
26 XGBoost_3_BAG_L1/T1 -42.183714 0.149132 11.119115 0.149132 11.119115 1 True 11
27 XGBoost_3_BAG_L2/T2 -44.398416 8.194961 324.290455 0.012751 0.432437 2 True 38
28 XGBoost_4_BAG_L1/T2 -45.036694 0.012968 0.408097 0.012968 0.408097 1 True 23
29 XGBoost_3_BAG_L1/T11 -46.565790 0.010453 0.196033 0.010453 0.196033 1 True 21
30 XGBoost_5_BAG_L1/T3 -46.809150 0.019303 1.483124 0.019303 1.483124 1 True 28
31 LightGBM_BAG_L1/T6 -57.139249 0.148825 16.597632 0.148825 16.597632 1 True 6
32 XGBoost_4_BAG_L1/T3 -61.054729 0.015561 0.937454 0.015561 0.937454 1 True 24
33 XGBoost_3_BAG_L1/T7 -65.839719 0.010491 0.184709 0.010491 0.184709 1 True 17
34 XGBoost_3_BAG_L1/T2 -70.802462 0.140320 10.417979 0.140320 10.417979 1 True 12
35 XGBoost_3_BAG_L2/T3 -71.821258 8.196570 324.537438 0.014360 0.679420 2 True 39
36 XGBoost_3_BAG_L1/T5 -81.053063 0.122843 9.956612 0.122843 9.956612 1 True 15
37 XGBoost_3_BAG_L1/T3 -106.663599 0.221608 12.177471 0.221608 12.177471 1 True 13
38 LightGBM_BAG_L1/T4 -112.509639 0.134455 16.921681 0.134455 16.921681 1 True 4
39 XGBoost_3_BAG_L1/T6 -121.451404 0.137436 9.800652 0.137436 9.800652 1 True 16
40 XGBoost_3_BAG_L1/T8 -152.508632 0.010308 0.212872 0.010308 0.212872 1 True 18
41 XGBoost_3_BAG_L1/T9 -172.894902 0.012863 0.237206 0.012863 0.237206 1 True 19
42 XGBoost_3_BAG_L1/T10 -198.856425 0.011681 0.351543 0.011681 0.351543 1 True 20
Number of models trained: 43
Types of models trained:
{'StackerEnsembleModel_XGBoost', 'WeightedEnsembleModel', 'StackerEnsembleModel_LGB'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 2 | ['season', 'weather']
('float', []) : 3 | ['atemp', 'temp', 'windspeed']
('int', []) : 4 | ['day', 'hour', 'humidity', 'month']
('int', ['bool']) : 3 | ['holiday', 'workingday', 'year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221023_020336/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'LightGBM_BAG_L1/T1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T3': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T4': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T5': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T6': 'StackerEnsembleModel_LGB',
'LightGBM_2_BAG_L1/T1': 'StackerEnsembleModel_LGB',
'LightGBM_3_BAG_L1/T1': 'StackerEnsembleModel_LGB',
'LightGBM_4_BAG_L1/T1': 'StackerEnsembleModel_LGB',
'XGBoost_2_BAG_L1/T1': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T1': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T2': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T3': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T4': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T5': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T6': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T7': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T8': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T9': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T10': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L1/T11': 'StackerEnsembleModel_XGBoost',
'XGBoost_4_BAG_L1/T1': 'StackerEnsembleModel_XGBoost',
'XGBoost_4_BAG_L1/T2': 'StackerEnsembleModel_XGBoost',
'XGBoost_4_BAG_L1/T3': 'StackerEnsembleModel_XGBoost',
'XGBoost_4_BAG_L1/T4': 'StackerEnsembleModel_XGBoost',
'XGBoost_5_BAG_L1/T1': 'StackerEnsembleModel_XGBoost',
'XGBoost_5_BAG_L1/T2': 'StackerEnsembleModel_XGBoost',
'XGBoost_5_BAG_L1/T3': 'StackerEnsembleModel_XGBoost',
'XGBoost_5_BAG_L1/T4': 'StackerEnsembleModel_XGBoost',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBM_BAG_L2/T1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T2': 'StackerEnsembleModel_LGB',
'LightGBM_2_BAG_L2/T1': 'StackerEnsembleModel_LGB',
'LightGBM_3_BAG_L2/T1': 'StackerEnsembleModel_LGB',
'LightGBM_4_BAG_L2/T1': 'StackerEnsembleModel_LGB',
'XGBoost_2_BAG_L2/T1': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L2/T1': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L2/T2': 'StackerEnsembleModel_XGBoost',
'XGBoost_3_BAG_L2/T3': 'StackerEnsembleModel_XGBoost',
'XGBoost_4_BAG_L2/T1': 'StackerEnsembleModel_XGBoost',
'XGBoost_4_BAG_L2/T2': 'StackerEnsembleModel_XGBoost',
'XGBoost_5_BAG_L2/T1': 'StackerEnsembleModel_XGBoost',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'LightGBM_BAG_L1/T1': -41.324753683080274,
'LightGBM_BAG_L1/T2': -35.59000961582356,
'LightGBM_BAG_L1/T3': -38.96490884412152,
'LightGBM_BAG_L1/T4': -112.50963897387132,
'LightGBM_BAG_L1/T5': -36.36421309556182,
'LightGBM_BAG_L1/T6': -57.13924895242837,
'LightGBM_2_BAG_L1/T1': -33.82057213457704,
'LightGBM_3_BAG_L1/T1': -33.824604698244464,
'LightGBM_4_BAG_L1/T1': -34.112015577540326,
'XGBoost_2_BAG_L1/T1': -34.52985570096839,
'XGBoost_3_BAG_L1/T1': -42.18371382724795,
'XGBoost_3_BAG_L1/T2': -70.80246208732468,
'XGBoost_3_BAG_L1/T3': -106.66359894276964,
'XGBoost_3_BAG_L1/T4': -36.985328592087086,
'XGBoost_3_BAG_L1/T5': -81.05306303122312,
'XGBoost_3_BAG_L1/T6': -121.45140392616798,
'XGBoost_3_BAG_L1/T7': -65.83971903826267,
'XGBoost_3_BAG_L1/T8': -152.50863166881652,
'XGBoost_3_BAG_L1/T9': -172.89490195050985,
'XGBoost_3_BAG_L1/T10': -198.85642528769088,
'XGBoost_3_BAG_L1/T11': -46.56579010090366,
'XGBoost_4_BAG_L1/T1': -37.17143912298946,
'XGBoost_4_BAG_L1/T2': -45.036693586675085,
'XGBoost_4_BAG_L1/T3': -61.05472874837559,
'XGBoost_4_BAG_L1/T4': -35.43905891072602,
'XGBoost_5_BAG_L1/T1': -35.86726614774864,
'XGBoost_5_BAG_L1/T2': -39.87750845041545,
'XGBoost_5_BAG_L1/T3': -46.80915033674734,
'XGBoost_5_BAG_L1/T4': -35.43905891072602,
'WeightedEnsemble_L2': -32.73992241379296,
'LightGBM_BAG_L2/T1': -32.76213725311967,
'LightGBM_BAG_L2/T2': -33.53084174739725,
'LightGBM_2_BAG_L2/T1': -33.22368257729854,
'LightGBM_3_BAG_L2/T1': -33.44633034585296,
'LightGBM_4_BAG_L2/T1': -33.556591604696365,
'XGBoost_2_BAG_L2/T1': -32.880989383934335,
'XGBoost_3_BAG_L2/T1': -35.79009780560234,
'XGBoost_3_BAG_L2/T2': -44.39841559511542,
'XGBoost_3_BAG_L2/T3': -71.8212581695465,
'XGBoost_4_BAG_L2/T1': -35.79009780560234,
'XGBoost_4_BAG_L2/T2': -35.923374701289305,
'XGBoost_5_BAG_L2/T1': -35.79009780560234,
'WeightedEnsemble_L3': -32.616307536991854},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'LightGBM_BAG_L1/T1': 'AutogluonModels/ag-20221023_020336/models/LightGBM_BAG_L1/T1/',
'LightGBM_BAG_L1/T2': 'AutogluonModels/ag-20221023_020336/models/LightGBM_BAG_L1/T2/',
'LightGBM_BAG_L1/T3': 'AutogluonModels/ag-20221023_020336/models/LightGBM_BAG_L1/T3/',
'LightGBM_BAG_L1/T4': 'AutogluonModels/ag-20221023_020336/models/LightGBM_BAG_L1/T4/',
'LightGBM_BAG_L1/T5': 'AutogluonModels/ag-20221023_020336/models/LightGBM_BAG_L1/T5/',
'LightGBM_BAG_L1/T6': 'AutogluonModels/ag-20221023_020336/models/LightGBM_BAG_L1/T6/',
'LightGBM_2_BAG_L1/T1': 'AutogluonModels/ag-20221023_020336/models/LightGBM_2_BAG_L1/T1/',
'LightGBM_3_BAG_L1/T1': 'AutogluonModels/ag-20221023_020336/models/LightGBM_3_BAG_L1/T1/',
'LightGBM_4_BAG_L1/T1': 'AutogluonModels/ag-20221023_020336/models/LightGBM_4_BAG_L1/T1/',
'XGBoost_2_BAG_L1/T1': 'AutogluonModels/ag-20221023_020336/models/XGBoost_2_BAG_L1/T1/',
'XGBoost_3_BAG_L1/T1': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T1/',
'XGBoost_3_BAG_L1/T2': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T2/',
'XGBoost_3_BAG_L1/T3': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T3/',
'XGBoost_3_BAG_L1/T4': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T4/',
'XGBoost_3_BAG_L1/T5': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T5/',
'XGBoost_3_BAG_L1/T6': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T6/',
'XGBoost_3_BAG_L1/T7': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T7/',
'XGBoost_3_BAG_L1/T8': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T8/',
'XGBoost_3_BAG_L1/T9': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T9/',
'XGBoost_3_BAG_L1/T10': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T10/',
'XGBoost_3_BAG_L1/T11': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L1/T11/',
'XGBoost_4_BAG_L1/T1': 'AutogluonModels/ag-20221023_020336/models/XGBoost_4_BAG_L1/T1/',
'XGBoost_4_BAG_L1/T2': 'AutogluonModels/ag-20221023_020336/models/XGBoost_4_BAG_L1/T2/',
'XGBoost_4_BAG_L1/T3': 'AutogluonModels/ag-20221023_020336/models/XGBoost_4_BAG_L1/T3/',
'XGBoost_4_BAG_L1/T4': 'AutogluonModels/ag-20221023_020336/models/XGBoost_4_BAG_L1/T4/',
'XGBoost_5_BAG_L1/T1': 'AutogluonModels/ag-20221023_020336/models/XGBoost_5_BAG_L1/T1/',
'XGBoost_5_BAG_L1/T2': 'AutogluonModels/ag-20221023_020336/models/XGBoost_5_BAG_L1/T2/',
'XGBoost_5_BAG_L1/T3': 'AutogluonModels/ag-20221023_020336/models/XGBoost_5_BAG_L1/T3/',
'XGBoost_5_BAG_L1/T4': 'AutogluonModels/ag-20221023_020336/models/XGBoost_5_BAG_L1/T4/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221023_020336/models/WeightedEnsemble_L2/',
'LightGBM_BAG_L2/T1': 'AutogluonModels/ag-20221023_020336/models/LightGBM_BAG_L2/T1/',
'LightGBM_BAG_L2/T2': 'AutogluonModels/ag-20221023_020336/models/LightGBM_BAG_L2/T2/',
'LightGBM_2_BAG_L2/T1': 'AutogluonModels/ag-20221023_020336/models/LightGBM_2_BAG_L2/T1/',
'LightGBM_3_BAG_L2/T1': 'AutogluonModels/ag-20221023_020336/models/LightGBM_3_BAG_L2/T1/',
'LightGBM_4_BAG_L2/T1': 'AutogluonModels/ag-20221023_020336/models/LightGBM_4_BAG_L2/T1/',
'XGBoost_2_BAG_L2/T1': 'AutogluonModels/ag-20221023_020336/models/XGBoost_2_BAG_L2/T1/',
'XGBoost_3_BAG_L2/T1': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L2/T1/',
'XGBoost_3_BAG_L2/T2': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L2/T2/',
'XGBoost_3_BAG_L2/T3': 'AutogluonModels/ag-20221023_020336/models/XGBoost_3_BAG_L2/T3/',
'XGBoost_4_BAG_L2/T1': 'AutogluonModels/ag-20221023_020336/models/XGBoost_4_BAG_L2/T1/',
'XGBoost_4_BAG_L2/T2': 'AutogluonModels/ag-20221023_020336/models/XGBoost_4_BAG_L2/T2/',
'XGBoost_5_BAG_L2/T1': 'AutogluonModels/ag-20221023_020336/models/XGBoost_5_BAG_L2/T1/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221023_020336/models/WeightedEnsemble_L3/'},
'model_fit_times': {'LightGBM_BAG_L1/T1': 16.239201068878174,
'LightGBM_BAG_L1/T2': 16.445437908172607,
'LightGBM_BAG_L1/T3': 16.041790008544922,
'LightGBM_BAG_L1/T4': 16.92168140411377,
'LightGBM_BAG_L1/T5': 16.055487871170044,
'LightGBM_BAG_L1/T6': 16.59763240814209,
'LightGBM_2_BAG_L1/T1': 38.13216161727905,
'LightGBM_3_BAG_L1/T1': 38.5370192527771,
'LightGBM_4_BAG_L1/T1': 40.472843170166016,
'XGBoost_2_BAG_L1/T1': 40.906012773513794,
'XGBoost_3_BAG_L1/T1': 11.119115114212036,
'XGBoost_3_BAG_L1/T2': 10.41797924041748,
'XGBoost_3_BAG_L1/T3': 12.177471160888672,
'XGBoost_3_BAG_L1/T4': 14.036921501159668,
'XGBoost_3_BAG_L1/T5': 9.956611633300781,
'XGBoost_3_BAG_L1/T6': 9.800651550292969,
'XGBoost_3_BAG_L1/T7': 0.1847093105316162,
'XGBoost_3_BAG_L1/T8': 0.21287155151367188,
'XGBoost_3_BAG_L1/T9': 0.23720574378967285,
'XGBoost_3_BAG_L1/T10': 0.35154271125793457,
'XGBoost_3_BAG_L1/T11': 0.19603323936462402,
'XGBoost_4_BAG_L1/T1': 0.49480533599853516,
'XGBoost_4_BAG_L1/T2': 0.4080970287322998,
'XGBoost_4_BAG_L1/T3': 0.9374542236328125,
'XGBoost_4_BAG_L1/T4': 1.2705605030059814,
'XGBoost_5_BAG_L1/T1': 0.5510251522064209,
'XGBoost_5_BAG_L1/T2': 0.6273515224456787,
'XGBoost_5_BAG_L1/T3': 1.4831244945526123,
'XGBoost_5_BAG_L1/T4': 1.3179364204406738,
'WeightedEnsemble_L2': 0.767611026763916,
'LightGBM_BAG_L2/T1': 17.128103494644165,
'LightGBM_BAG_L2/T2': 19.074299335479736,
'LightGBM_2_BAG_L2/T1': 30.758761405944824,
'LightGBM_3_BAG_L2/T1': 35.067671060562134,
'LightGBM_4_BAG_L2/T1': 43.9777295589447,
'XGBoost_2_BAG_L2/T1': 28.958407402038574,
'XGBoost_3_BAG_L2/T1': 0.5516231060028076,
'XGBoost_3_BAG_L2/T2': 0.4324371814727783,
'XGBoost_3_BAG_L2/T3': 0.6794204711914062,
'XGBoost_4_BAG_L2/T1': 0.8709614276885986,
'XGBoost_4_BAG_L2/T2': 0.8628041744232178,
'XGBoost_5_BAG_L2/T1': 1.236293077468872,
'WeightedEnsemble_L3': 0.43173789978027344},
'model_pred_times': {'LightGBM_BAG_L1/T1': 0.138657808303833,
'LightGBM_BAG_L1/T2': 0.1879441738128662,
'LightGBM_BAG_L1/T3': 0.2017383575439453,
'LightGBM_BAG_L1/T4': 0.13445544242858887,
'LightGBM_BAG_L1/T5': 0.1596670150756836,
'LightGBM_BAG_L1/T6': 0.14882540702819824,
'LightGBM_2_BAG_L1/T1': 2.035104990005493,
'LightGBM_3_BAG_L1/T1': 1.926668405532837,
'LightGBM_4_BAG_L1/T1': 1.739962100982666,
'XGBoost_2_BAG_L1/T1': 0.5671615600585938,
'XGBoost_3_BAG_L1/T1': 0.14913177490234375,
'XGBoost_3_BAG_L1/T2': 0.14031982421875,
'XGBoost_3_BAG_L1/T3': 0.22160816192626953,
'XGBoost_3_BAG_L1/T4': 0.1706860065460205,
'XGBoost_3_BAG_L1/T5': 0.12284255027770996,
'XGBoost_3_BAG_L1/T6': 0.1374359130859375,
'XGBoost_3_BAG_L1/T7': 0.010491132736206055,
'XGBoost_3_BAG_L1/T8': 0.010307788848876953,
'XGBoost_3_BAG_L1/T9': 0.0128631591796875,
'XGBoost_3_BAG_L1/T10': 0.011681079864501953,
'XGBoost_3_BAG_L1/T11': 0.010452508926391602,
'XGBoost_4_BAG_L1/T1': 0.013148069381713867,
'XGBoost_4_BAG_L1/T2': 0.012968063354492188,
'XGBoost_4_BAG_L1/T3': 0.015561103820800781,
'XGBoost_4_BAG_L1/T4': 0.01623392105102539,
'XGBoost_5_BAG_L1/T1': 0.014343023300170898,
'XGBoost_5_BAG_L1/T2': 0.014993667602539062,
'XGBoost_5_BAG_L1/T3': 0.019303083419799805,
'XGBoost_5_BAG_L1/T4': 0.016123056411743164,
'WeightedEnsemble_L2': 0.0008077621459960938,
'LightGBM_BAG_L2/T1': 0.1342144012451172,
'LightGBM_BAG_L2/T2': 0.11953520774841309,
'LightGBM_2_BAG_L2/T1': 0.1743755340576172,
'LightGBM_3_BAG_L2/T1': 0.1576690673828125,
'LightGBM_4_BAG_L2/T1': 0.22277402877807617,
'XGBoost_2_BAG_L2/T1': 0.16775941848754883,
'XGBoost_3_BAG_L2/T1': 0.013207435607910156,
'XGBoost_3_BAG_L2/T2': 0.012751340866088867,
'XGBoost_3_BAG_L2/T3': 0.014360427856445312,
'XGBoost_4_BAG_L2/T1': 0.015470504760742188,
'XGBoost_4_BAG_L2/T2': 0.014680624008178711,
'XGBoost_5_BAG_L2/T1': 0.015113115310668945,
'WeightedEnsemble_L3': 0.0010557174682617188},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'LightGBM_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T4': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T5': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T6': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_2_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_3_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_4_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_2_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T4': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T5': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T6': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T7': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T8': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T9': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T10': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L1/T11': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_4_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_4_BAG_L1/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_4_BAG_L1/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_4_BAG_L1/T4': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_5_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_5_BAG_L1/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_5_BAG_L1/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_5_BAG_L1/T4': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_2_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_3_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_4_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_2_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L2/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_3_BAG_L2/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_4_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_4_BAG_L2/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'XGBoost_5_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -32.616308 8.485239 370.376266
1 WeightedEnsemble_L2 -32.739922 6.628335 189.298007
2 LightGBM_BAG_L2/T1 -32.762137 8.316424 340.986121
3 XGBoost_2_BAG_L2/T1 -32.880989 8.349969 352.816425
4 LightGBM_2_BAG_L2/T1 -33.223683 8.356585 354.616779
5 LightGBM_3_BAG_L2/T1 -33.446330 8.339879 358.925689
6 LightGBM_BAG_L2/T2 -33.530842 8.301745 342.932317
7 LightGBM_4_BAG_L2/T1 -33.556592 8.404984 367.835747
8 LightGBM_2_BAG_L1/T1 -33.820572 2.035105 38.132162
9 LightGBM_3_BAG_L1/T1 -33.824605 1.926668 38.537019
10 LightGBM_4_BAG_L1/T1 -34.112016 1.739962 40.472843
11 XGBoost_2_BAG_L1/T1 -34.529856 0.567162 40.906013
12 XGBoost_5_BAG_L1/T4 -35.439059 0.016123 1.317936
13 XGBoost_4_BAG_L1/T4 -35.439059 0.016234 1.270561
14 LightGBM_BAG_L1/T2 -35.590010 0.187944 16.445438
15 XGBoost_3_BAG_L2/T1 -35.790098 8.195417 324.409641
16 XGBoost_5_BAG_L2/T1 -35.790098 8.197323 325.094311
17 XGBoost_4_BAG_L2/T1 -35.790098 8.197680 324.728979
18 XGBoost_5_BAG_L1/T1 -35.867266 0.014343 0.551025
19 XGBoost_4_BAG_L2/T2 -35.923375 8.196890 324.720822
20 LightGBM_BAG_L1/T5 -36.364213 0.159667 16.055488
21 XGBoost_3_BAG_L1/T4 -36.985329 0.170686 14.036922
22 XGBoost_4_BAG_L1/T1 -37.171439 0.013148 0.494805
23 LightGBM_BAG_L1/T3 -38.964909 0.201738 16.041790
24 XGBoost_5_BAG_L1/T2 -39.877508 0.014994 0.627352
25 LightGBM_BAG_L1/T1 -41.324754 0.138658 16.239201
26 XGBoost_3_BAG_L1/T1 -42.183714 0.149132 11.119115
27 XGBoost_3_BAG_L2/T2 -44.398416 8.194961 324.290455
28 XGBoost_4_BAG_L1/T2 -45.036694 0.012968 0.408097
29 XGBoost_3_BAG_L1/T11 -46.565790 0.010453 0.196033
30 XGBoost_5_BAG_L1/T3 -46.809150 0.019303 1.483124
31 LightGBM_BAG_L1/T6 -57.139249 0.148825 16.597632
32 XGBoost_4_BAG_L1/T3 -61.054729 0.015561 0.937454
33 XGBoost_3_BAG_L1/T7 -65.839719 0.010491 0.184709
34 XGBoost_3_BAG_L1/T2 -70.802462 0.140320 10.417979
35 XGBoost_3_BAG_L2/T3 -71.821258 8.196570 324.537438
36 XGBoost_3_BAG_L1/T5 -81.053063 0.122843 9.956612
37 XGBoost_3_BAG_L1/T3 -106.663599 0.221608 12.177471
38 LightGBM_BAG_L1/T4 -112.509639 0.134455 16.921681
39 XGBoost_3_BAG_L1/T6 -121.451404 0.137436 9.800652
40 XGBoost_3_BAG_L1/T8 -152.508632 0.010308 0.212872
41 XGBoost_3_BAG_L1/T9 -172.894902 0.012863 0.237206
42 XGBoost_3_BAG_L1/T10 -198.856425 0.011681 0.351543
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.001056 0.431738 3 True
1 0.000808 0.767611 2 True
2 0.134214 17.128103 2 True
3 0.167759 28.958407 2 True
4 0.174376 30.758761 2 True
5 0.157669 35.067671 2 True
6 0.119535 19.074299 2 True
7 0.222774 43.977730 2 True
8 2.035105 38.132162 1 True
9 1.926668 38.537019 1 True
10 1.739962 40.472843 1 True
11 0.567162 40.906013 1 True
12 0.016123 1.317936 1 True
13 0.016234 1.270561 1 True
14 0.187944 16.445438 1 True
15 0.013207 0.551623 2 True
16 0.015113 1.236293 2 True
17 0.015471 0.870961 2 True
18 0.014343 0.551025 1 True
19 0.014681 0.862804 2 True
20 0.159667 16.055488 1 True
21 0.170686 14.036922 1 True
22 0.013148 0.494805 1 True
23 0.201738 16.041790 1 True
24 0.014994 0.627352 1 True
25 0.138658 16.239201 1 True
26 0.149132 11.119115 1 True
27 0.012751 0.432437 2 True
28 0.012968 0.408097 1 True
29 0.010453 0.196033 1 True
30 0.019303 1.483124 1 True
31 0.148825 16.597632 1 True
32 0.015561 0.937454 1 True
33 0.010491 0.184709 1 True
34 0.140320 10.417979 1 True
35 0.014360 0.679420 2 True
36 0.122843 9.956612 1 True
37 0.221608 12.177471 1 True
38 0.134455 16.921681 1 True
39 0.137436 9.800652 1 True
40 0.010308 0.212872 1 True
41 0.012863 0.237206 1 True
42 0.011681 0.351543 1 True
fit_order
0 43
1 30
2 31
3 36
4 33
5 34
6 32
7 35
8 7
9 8
10 9
11 10
12 29
13 25
14 2
15 37
16 42
17 40
18 26
19 41
20 5
21 14
22 22
23 3
24 27
25 1
26 11
27 38
28 23
29 21
30 28
31 6
32 24
33 17
34 12
35 39
36 15
37 13
38 4
39 16
40 18
41 19
42 20 }
predictor_new_hpo.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val")
<AxesSubplot:xlabel='model'>
# The test set has no ground-truth "count", so fill it with a 0 placeholder
test_new["count"] = 0
performance_new_hpo = predictor_new_hpo.evaluate(test_new)
print("The performance indicators are : \n", performance_new_hpo)
/usr/local/lib/python3.7/site-packages/scipy/stats/stats.py:4023: PearsonRConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
warnings.warn(PearsonRConstantInputWarning())
Evaluation: root_mean_squared_error on test data: -257.90773442592734
Note: Scores are always higher_is_better. This metric score can be multiplied by -1 to get the metric value.
Evaluations on test data:
{
"root_mean_squared_error": -257.90773442592734,
"mean_squared_error": -66516.39947671468,
"mean_absolute_error": -190.9576504444573,
"r2": 0.0,
"pearsonr": NaN,
"median_absolute_error": -147.15931701660156
}
The performance indicators are :
{'root_mean_squared_error': -257.90773442592734, 'mean_squared_error': -66516.39947671468, 'mean_absolute_error': -190.9576504444573, 'r2': 0.0, 'pearsonr': nan, 'median_absolute_error': -147.15931701660156}
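Because every "count" label in `test_new` was set to 0 above, the reported RMSE only reflects the magnitude of the predictions, R² collapses to 0, and Pearson's r is undefined on a constant input (hence the scipy warning). A small sketch with made-up toy numbers illustrates why:

```python
import numpy as np

# Toy data (hypothetical): constant "true" labels, arbitrary predictions
y_true = np.zeros(5)
y_pred = np.array([10.0, 20.0, 5.0, 0.0, 15.0])

# With y_true all zero, RMSE reduces to the root-mean-square of the predictions,
# so it measures prediction magnitude rather than prediction error
rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
print(rmse)  # sqrt(750 / 5) ~= 12.247
```

The real test score therefore only comes from the Kaggle submission below, which is scored against the hidden labels.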
# Remember to set all negative values to zero (verified below: none are negative here)
predictions_new_features = predictor_new_hpo.predict(test_new)
predictions_new_features
0 10.835062
1 7.043610
2 6.781248
3 6.762746
4 6.762746
...
6488 359.697021
6489 215.506653
6490 159.988556
6491 101.330910
6492 56.446304
Name: count, Length: 6493, dtype: float32
predictions_new_features[predictions_new_features<0]
Series([], Name: count, dtype: float32)
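The filter above returned an empty Series, so no clipping was needed in this run; in general, though, negative regression outputs must be set to zero before submitting, since Kaggle's RMSLE scorer rejects negative counts. A minimal sketch using `Series.clip` on hypothetical values:

```python
import pandas as pd

# Hypothetical predictions containing a negative value
preds = pd.Series([12.5, -3.1, 0.0, 47.8], name="count")

# Clip negatives to zero, as the competition expects non-negative counts
preds = preds.clip(lower=0)
print(preds.tolist())  # [12.5, 0.0, 0.0, 47.8]
```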
# Submit the predictions, same as before
submission_new_hpo = submission.copy()
submission_new_hpo["count"] = predictions_new_features
submission_new_hpo.to_csv("submission_new_hpo.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo.csv -m "new features with hyperparameter tuning of GBM and XGBoost"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 381kB/s] Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                     date                 description                                                 status    publicScore  privateScore
---------------------------  -------------------  ----------------------------------------------------------  --------  -----------  ------------
submission_new_hpo.csv       2022-10-23 02:55:42  new features with hyperparameter tuning of GBM and XGBoost  complete  0.47866      0.47866
submission_new_hpo.csv       2022-10-23 02:24:32  new features with hyperparameter tuning of GBM and XGBoost  complete  0.47866      0.47866
submission_new_hpo.csv       2022-10-19 22:30:48  new features with hyperparameters                           complete  0.48898      0.48898
submission_new_features.csv  2022-10-19 20:23:34  new features + set weather, holiday, season, workingday    complete  1.80119      1.80119
from sklearn.ensemble import RandomForestRegressor

rf = RandomForestRegressor(n_estimators=100, random_state=2020)
# Now launch training on our training dataset.
# Note: the label must be selected with bracket notation. Attribute access
# (train_new.count) returns the DataFrame.count *method*, not the column, and
# makes scikit-learn raise "Singleton array ... cannot be considered a valid collection".
rf.fit(train_new[train_new.columns.difference(['count', 'datetime'])], train_new['count'])
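Selecting the label as `train_new.count` is a classic pandas pitfall: attribute access resolves to the `DataFrame.count` method (because `count` shadows a method name), so scikit-learn receives a bound method instead of an array of labels and raises a TypeError. A small demonstration with a made-up frame:

```python
import pandas as pd

# Hypothetical frame whose column name shadows a DataFrame method
df = pd.DataFrame({"count": [16, 40, 32], "temp": [9.84, 9.02, 9.02]})

# Attribute access hits the DataFrame.count method, not the column
assert callable(df.count)

# Bracket indexing always returns the column itself
labels = df["count"]
print(labels.tolist())  # [16, 40, 32]
```

Bracket indexing (`df["count"]`) is the safe choice whenever a column name might collide with a DataFrame attribute.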
0.47866
# Taking the top model score from each training run and creating a line plot to show improvement
# You can create these in the notebook and save them to PNG or use some other tool (e.g. google sheets, excel)
fig = pd.DataFrame(
{
"model": ["initial", "add_features", "hpo1", "hpo2"],
"score": [ -52.7, -30.18, -32.3, -32.6]
}
).plot(x="model", y="score", figsize=(8, 6)).get_figure()
fig.savefig('img/model_train_score_project.png')
# Take the four Kaggle scores and create a line plot to show improvement
fig = pd.DataFrame(
{
"test_eval": ["initial", "add_features", "hpo1", "hpo2"],
"score": [1.809, 1.801, 0.49, 0.48]
}
).plot(x="test_eval", y="score", figsize=(8, 6)).get_figure()
fig.savefig('img/model_test_score_project.png')
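Note that `savefig` raises a FileNotFoundError if the target directory is missing, so it can help to create `img/` first. A minimal sketch (directory name as used above):

```python
import os

# Create the output directory for the PNGs if it does not exist yet
os.makedirs("img", exist_ok=True)
```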
# The hyperparameter settings used in each of the three training runs, with the Kaggle score as the result
pd.DataFrame({
"model": ["initial", "add_features", "hpo"],
"time_limit": [600, 600, 600],
"presets": ["best_quality", "best_quality", "best_quality"],
"hyperparameters": ['default', 'default', "{'GBM': gbm_config,'XGB': xgb_config}"],
"hyperparameter_tune_kwargs":["-", "auto", "{'searcher':'auto'}"],
"score": [1.809, 0.49, 0.48]
})
|   | model | time_limit | presets | hyperparameters | hyperparameter_tune_kwargs | score |
|---|---|---|---|---|---|---|
| 0 | initial | 600 | best_quality | default | - | 1.809 |
| 1 | add_features | 600 | best_quality | default | auto | 0.490 |
| 2 | hpo | 600 | best_quality | {'GBM': gbm_config,'XGB': xgb_config} | {'searcher':'auto'} | 0.480 |